METHODS OF DIAGNOSIS OF LUNG CANCER, COMPOSITIONS AND METHODS
OF SCREENING FOR MODULATORS OF LUNG CANCER 



CROSS-REFERENCES TO RELATED APPLICATIONS 

This application is related to USSN 60/284,770, filed April 18, 2001; USSN
60/290,492, filed May 10, 2001; USSN 60/334,370, filed November 29, 2001; USSN
60/339,245, filed November 9, 2001; USSN 60/350,666, filed November 13, 2001; and
USSN 60/xxx,xxx, filed April 12, 2002 (Docket OMNI-002P); each of which is incorporated
herein by reference in its entirety.

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression
profiles and nucleic acids, products, and antibodies thereto that are involved in lung cancer;
and to the use of such expression profiles and compositions in diagnosis and therapy of lung
cancer. The invention further relates to methods for identifying and using agents and/or
targets that inhibit lung cancer or related conditions.

BACKGROUND OF THE INVENTION

Lung cancer is the second most commonly occurring cancer in the United States and
is the leading cause of cancer-related death. It is estimated that there are over 160,000 new
cases of lung cancer in the United States every year. Of those who are diagnosed with lung
cancer, 86 percent will die within five years. Lung cancer is the most common visceral
cancer in men and accounts for nearly one third of all cancer deaths in both men and women.
In fact, lung cancer accounts for 7% of all deaths, due to any cause, in both men and women.

Smoking is the primary cause of lung cancer, with more than 80% of lung cancers
resulting from smoking. About 400 to 500 separate gaseous substances are present in the
smoke of a non-filter cigarette. The most noteworthy substances include nitrogen oxides,
hydrogen cyanide, formaldehyde, benzene, and toluene. The particles present in cigarette
smoke contain at least 3,500 individual compounds such as nicotine, tobacco alkaloids
(nornicotine, anatabine, anabasine), polycyclic aromatic hydrocarbons (e.g., benzo(a)pyrene,
B(a)P), naphthalenes, aromatic amines, phenols, and tobacco-specific nitrosamines.

Tobacco-specific nitrosamines are formed during tobacco curing and processing, and
are suspected of causing lung cancer in humans. In rodent studies, regardless of where or
how it is applied, the tobacco-specific nitrosamine known as NNK produces lung adenomas
and lung adenocarcinomas. The tobacco-specific nitrosamine known as NNAL also produces
lung adenocarcinomas in rodents.

Many of the chemicals found in cigarette smoke also affect the nonsmoker inhaling
"secondhand" or sidestream smoke. Indeed, the smoke inhaled by nonsmokers has a
chemical composition similar to the smoke inhaled by smokers, but, importantly, the
carcinogenic tobacco-specific nitrosamines are present in higher concentrations in
secondhand smoke. For this and other reasons, "passive smoking" is an
important cause of lung cancer, causing as many as 3,000 lung cancer deaths in nonsmokers
each year.

In addition to smoking, other factors thought to be causes of lung cancer include
on-the-job exposure to carcinogens such as asbestos and uranium, exposure to chemical hazards
such as radon, polycyclic aromatic hydrocarbons, chromium, nickel, and inorganic arsenic,
genetic factors, and diet.

Histological classification of various lung cancers defines the types of cancer that
begin in the lung. See, e.g., Travis, et al. (1999) Histological Typing of Lung and Pleural
Tumours (International Histological Classification of Tumours, No. 1). Four major cell types
make up more than 88% of all primary lung neoplasms. These are: squamous or epidermoid
carcinoma, small cell (also called oat cell) carcinoma, adenocarcinoma, and large cell (also
called large cell anaplastic) carcinoma. The remainder include undifferentiated carcinomas,
carcinoids, bronchial gland tumors, and other rarer types. The various cell types have
different natural histories and responses to therapy, and, thus, a correct histologic diagnosis is
the first step of effective treatment.

Small cell lung cancer (SCLC) accounts for 18-25% of all lung cancers. It occurs
less frequently than non-small cell lung cancer, and generally spreads to distant organs more
rapidly than non-small cell lung cancer. In general, at the time of presentation small cell lung
cancers have already spread beyond the bounds where surgery with curative intent
can be undertaken. However, if identified early enough, these cancers are often responsive to
chemotherapy and thoracic radiation treatment.

Non-small cell lung cancers (NSCLC) are the more frequently occurring form of lung
cancer. They comprise squamous cell carcinoma, adenocarcinoma, and large cell carcinoma
and account for more than 75% of all lung cancers. Non-small cell tumors that are localized
at the time of presentation can sometimes be cured with surgery and/or radiotherapy, but
usually are not identified until significant metastasis has occurred; metastatic tumors are
typically not very responsive to surgical, chemotherapy, or radiation treatment.

The screening of asymptomatic persons at high risk for lung cancer has often proven
ineffective. In general, only 5 to 15 percent of lung cancer patients have their disease
detected while they are asymptomatic. Of course, early detection and treatment are critical
factors in the fight against lung cancer. The average survival rate is 49% for those whose
cancer is detected early, before the cancer has spread from the lung. Lung cancer often
spreads outside of the lung, and it may have spread to the bones or brain by the time it is
diagnosed. While the prognosis may be better for lung cancers that are detected early,
because of the lack of effective curative treatments, early detection does not necessarily alter
the total death rate from lung cancer.

Thus, methods for diagnosis and prognosis of lung cancer and effective treatment of
lung cancer would be desirable. Accordingly, provided herein are methods that can be used
in diagnosis and prognosis of lung cancer. Further provided are methods that can be used to
screen candidate therapeutic agents for the ability to modulate, e.g., treat, lung cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in lung disease and other metastatic cancers.

SUMMARY OF THE INVENTION 
The present invention provides nucleotide sequences of genes that are up- and
down-regulated in lung cancer cells. Such genes are useful for diagnostic purposes, and also as
targets for screening for therapeutic compounds that modulate lung cancer, such as
antibodies. The methods of detecting nucleic acids of the invention or their encoded proteins
can be used for a number of purposes. Examples include early detection of lung cancers,
monitoring and early detection of relapse following treatment of lung cancers, monitoring
response to therapy of lung cancers, determining prognosis of lung cancers, directing therapy
of lung cancers, selecting patients for postoperative chemotherapy or radiation therapy,
selecting therapy, determining tumor prognosis, treatment, or response to treatment, and early
detection of precancerous lesions of the lung. Examples of benign or precancerous lesions
include: atelectasis, emphysema, bronchitis, chronic obstructive pulmonary disease, fibrosis,
hypersensitivity pneumonitis (HP), interstitial pulmonary fibrosis (IPF), asthma, and
bronchiectasis. Other aspects of the invention will become apparent to the skilled artisan by
the following description of the invention.

In one aspect, the present invention provides a method of detecting a lung
cancer-associated transcript in a cell from a patient, the method comprising contacting a
biological sample from the patient with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1A-16. Alternatively, the
sample may be contacted with a specific binding reagent, e.g., antibody.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at least
95% identical to a sequence as shown in Tables 1A-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1A-16.

In one embodiment, the biological sample is a tissue sample, or a body fluid. In
another embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA.

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent label. In
one embodiment, the polynucleotide is immobilized on a solid surface. In one embodiment,
the patient is undergoing a therapeutic regimen to treat lung cancer. In another embodiment,
the patient is suspected of having lung cancer. In one embodiment, the patient is a primate,
e.g., a human.

In one embodiment, the method further comprises the step of amplifying nucleic acids
before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated transcript in the biological sample by contacting the
biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80%
identical to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the
therapy. Alternatively, the sample may be evaluated for protein, e.g., by contacting the sample
with an antibody.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment. Alternatively, the sample may be evaluated for comparison of protein.

In another aspect, the present invention provides a method of monitoring the efficacy 
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a 



WO 02/086443 PCT/US02/12476

biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated antibody in the biological sample by contacting the
biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes
to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, wherein the
polypeptide specifically binds to the lung cancer-associated antibody, thereby monitoring the
efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated antibody to a level of the lung cancer-associated antibody
in a biological sample from the patient prior to, or earlier in, the therapeutic treatment.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated polypeptide in the biological sample by contacting the
biological sample with an antibody, wherein the antibody specifically binds to a polypeptide
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical
to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated polypeptide to a level of the lung cancer-associated
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1A-16. In one embodiment, an
expression vector or cell comprises the isolated nucleic acid. In one aspect, the present
invention provides an isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1A-16.

In another aspect, the present invention provides an antibody that specifically binds to
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide
sequence as shown in Tables 1A-16. In one embodiment, the antibody is conjugated to an
effector component, e.g., a fluorescent label, a radioisotope, or a cytotoxic chemical. In one
embodiment, the antibody is an antibody fragment. In another embodiment, the antibody is
humanized.

In one aspect, the present invention provides a method of detecting lung cancer in a
patient, the method comprising contacting a biological sample from the patient with an
antibody or protein as described herein.

In another aspect, the present invention provides a method of detecting antibodies
specific to a lung cancer gene in a patient, the method comprising contacting a biological
sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence
from Tables 1A-16.

In another aspect, the present invention provides a method for identifying a compound
that modulates a lung cancer-associated polypeptide, the method comprising the steps of: (i)
contacting the compound with a lung cancer-associated polypeptide, the polypeptide encoded
by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1A-16; and (ii) determining the functional effect of the
compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a
chemical effect. In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant. In one
embodiment, the functional effect is determined by measuring ligand binding to the
polypeptide.

In another aspect, the present invention provides a method of inhibiting proliferation
or another critical process of a lung cancer-associated cell to treat lung cancer in a patient, the
method comprising the step of administering to the subject a therapeutically effective amount
of a compound identified as described herein. In one embodiment, the compound is an
antibody.

In another aspect, the present invention provides a drug screening assay comprising
the steps of: (i) administering a test compound to a mammal having lung cancer or a cell
isolated therefrom; (ii) comparing the level of gene expression of a polynucleotide that
selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables
1A-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in
a control cell or mammal, wherein a test compound that modulates the level of expression of
the polynucleotide is a candidate for the treatment of lung cancer.

In one embodiment, the control is a mammal with lung cancer or a cell therefrom that
has not been treated with the test compound. In another embodiment, the control is a normal
cell or mammal, or a cell or mammal with a non-malignant lung disease.

In another aspect, the present invention provides a method for treating a mammal
having lung cancer comprising administering a compound identified by the assay described
herein.

In another aspect, the present invention provides a pharmaceutical composition for
treating a mammal having lung cancer, the composition comprising a compound identified by
the assay described herein and a physiologically acceptable excipient.

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the objects outlined above, the present invention provides novel
methods for diagnosis and treatment of lung disease or cancer, as well as methods for
screening for compositions which modulate lung cancer. "Treatment, monitoring, detection,
or modulation of lung disease or cancer" includes treatment, monitoring, detection, or
modulation of lung disease in those patients who have lung disease (whether malignant or
non-malignant, e.g., emphysema, bronchitis, or fibrosis) as well as patients with lung cancers
in which gene expression from a gene in Tables 1A-16 is increased or decreased, indicating
that the subject is more likely to have disease. In particular, while these targets are identified
primarily from lung cancer samples, these same targets are likely to be similarly found in
analyses of other medical conditions. These other conditions may result from similar
pathological processes which affect similar tissues, e.g., lung cancer, small cell lung
carcinoma (oat cell carcinoma), non-small cell carcinomas (e.g., squamous cell carcinoma,
adenocarcinoma, large cell lung carcinoma, carcinoid, granulomatous), fibrosis (idiopathic
pulmonary fibrosis (IPF), hypersensitivity pneumonitis (HP), interstitial pneumonitis,
nonspecific idiopathic pneumonitis (NSIP)), chronic obstructive pulmonary disease (COPD,
e.g., emphysema, chronic bronchitis), asthma, bronchiectasis, and esophageal cancer. See,
e.g., the NCI webpage and USSN 60/347,349 and USSN 60/xxx,xxx (docket LFBR-001-1P,
filed March 29, 2002), each of which is incorporated herein by reference. The treatment may
be of the lung cancer or related condition itself, or of metastasis.

In particular, identification of markers selectively expressed on these cancers allows
for use of that expression in diagnostic, prognostic, or therapeutic methods. As such, the
invention defines various compositions, e.g., nucleic acids, polypeptides, antibodies, and
small molecule agonists/antagonists, which will be useful to selectively identify those
markers. For example, therapeutic methods may take the form of protein therapeutics which
use the marker expression for selective localization or modulation of function (for those
markers which have a causative disease effect), for vaccines, identification of binding
partners, or antagonism, e.g., using antisense or RNAi. The markers may be useful for
molecular characterization of subsets of lung diseases, which subsets may actually require
very different treatments. Moreover, the markers may also be important in diseases related to
the specific cancer, e.g., diseases which affect similar tissues in non-malignant form, or have
similar mechanisms of induction/maintenance. Metastatic processes or characteristics may
also be targeted. Diagnostic and prognostic uses are made available, e.g., to subset related
but distinct diseases, or to determine treatment strategy. The detection methods may be based
upon nucleic acid, e.g., PCR or hybridization techniques, or protein, e.g., ELISA, imaging,
IHC, etc. The diagnosis may be qualitative or quantitative, and may detect increases or
decreases in expression levels.

Tables 1A-16 provide unigene cluster identification numbers for the nucleotide
sequence of genes that exhibit increased or decreased expression in lung cancer samples. The
tables also provide an exemplar accession number that provides a nucleotide sequence that is
part of the unigene cluster. In Table 1A, genes marked as "target 1" or "target 2" are
particularly useful as therapeutic targets. Genes marked as "target 3" are particularly useful
as diagnostic markers. Genes marked as "chron" are upregulated in chronically diseased lung
(e.g., emphysema, bronchitis, fibrosis) relative to lung tumors and normal tissue. In certain
analyses, the ratio for the "chron" category was determined using the 70th percentile of
chronically diseased lung samples divided by the 90th percentile of normal lung samples.
The ratio for the targets was determined using the 70th percentile of lung tumor samples
divided by the 90th percentile of normal lung samples.
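
The percentile ratio described above can be illustrated with a short sketch. This is a hypothetical example only: the function names, the linear-interpolation percentile rule, and the expression values are assumptions made for illustration and are not taken from the analyses behind Tables 1A-16.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) of values.

    Assumption: linear interpolation between ranked values; the text
    does not say which percentile convention was used.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def expression_ratio(disease_levels, normal_levels,
                     disease_pct=70, normal_pct=90):
    """Ratio of a disease-sample percentile to a normal-sample percentile.

    The defaults mirror the text: 70th percentile of tumor (or chronically
    diseased) samples divided by the 90th percentile of normal samples.
    """
    return (percentile(disease_levels, disease_pct)
            / percentile(normal_levels, normal_pct))


# Made-up expression levels for a single gene, for illustration only.
tumor = [0, 90, 180, 270, 360, 450, 540, 630, 720, 810, 900]
normal = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(expression_ratio(tumor, normal))  # 630 / 90 = 7.0
```

Comparing a mid-to-upper percentile of diseased samples against a high percentile of normal samples is a conservative choice: a gene scores well only when most diseased samples exceed nearly all normal samples.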

Definitions 

The term "lung cancer protein" or "lung cancer polynucleotide" or "lung
cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles,
mutants, and interspecies homologs that: (1) have a nucleotide sequence that has greater than
about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or greater nucleotide sequence identity,
preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or
more nucleotides, to a nucleotide sequence of or associated with a unigene cluster of Tables
1A-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen
comprising an amino acid sequence encoded by a nucleotide sequence of or associated with a
unigene cluster of Tables 1A-16, and conservatively modified variants thereof; (3)
specifically hybridize under stringent hybridization conditions to a nucleic acid sequence, or
the complement thereof, of Tables 1A-16 and conservatively modified variants thereof; or (4)
have an amino acid sequence that has greater than about 60% amino acid sequence identity,
65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%,
or 99% or greater amino acid sequence identity, preferably over a region of at
least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence
encoded by a nucleotide sequence of or associated with a unigene cluster of Tables 1A-16. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. A "lung cancer polypeptide" and a "lung cancer polynucleotide" include
both naturally occurring and recombinant forms.

A "full length" lung cancer protein or nucleic acid refers to a lung cancer polypeptide
or polynucleotide sequence, or a variant thereof, that contains the elements normally
contained in one or more naturally occurring, wild type lung cancer polynucleotide or
polypeptide sequences. The "full length" may be prior to, or after, various stages of
post-translational processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a lung cancer protein, polynucleotide, or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
archival materials, blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc.
Biological samples also include explants and primary and/or transformed cell cultures
derived from patient tissues. A biological sample is typically obtained from a eukaryotic
organism, most preferably a mammal such as a primate, e.g., chimpanzee or human; cow;
dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or other mammal; or a bird; reptile;
fish. Livestock and domestic animals are of interest.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues or materials, having treatment or outcome
history, will be particularly useful.

The terms "identical" or percent "identity," in the context of two or more nucleic
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the
same or have a specified percentage of amino acid residues or nucleotides that are the same
(e.g., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using, e.g., a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or insertions, substitutions, and naturally occurring, e.g., polymorphic or allelic variants,
and man-made variants. As described below, the preferred algorithms can account for gaps
and the like. Preferably, identity exists over a region that is at least about 25 amino acids or
nucleotides in length, or more preferably over a region that is 50-100 amino acids or
nucleotides in length.

For sequence comparison, typically one sequence acts as a reference sequence, to
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.
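
As a minimal sketch of the percent identity calculation, the following assumes the two sequences have already been aligned (with "-" marking gaps) and counts identical positions over a designated region. The function name and the convention of counting a gap opposite a residue as a mismatch are illustrative assumptions; alignment programs differ in how they choose the denominator.

```python
def percent_identity(aligned_ref, aligned_test, region=None):
    """Percent of aligned positions that are identical.

    aligned_ref and aligned_test are equal-length strings from an existing
    alignment, with '-' marking gaps. region is an optional (start, end)
    pair of alignment coordinates (end exclusive), standing in for the
    "subsequence coordinates" mentioned in the text.
    Assumption: a gap opposite a residue counts as a compared, non-identical
    position; columns that are gaps in both sequences are skipped.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    start, end = region if region is not None else (0, len(aligned_ref))
    matches = 0
    compared = 0
    for ref_char, test_char in zip(aligned_ref[start:end],
                                   aligned_test[start:end]):
        if ref_char == "-" and test_char == "-":
            continue
        compared += 1
        if ref_char == test_char:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions in region")
    return 100.0 * matches / compared


print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 7 of 8 positions: 87.5
print(percent_identity("ACG-T", "ACGAT"))        # 4 of 5 positions: 80.0
```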

A "comparison window", as used herein, includes reference to a segment of
contiguous positions selected from the group consisting typically of from 20 to 600, usually
about 50 to about 200, more usually about 100 to about 150 in which a sequence may be
compared to a reference sequence of the same number of contiguous positions after the two
sequences are optimally aligned. Methods of alignment of sequences for comparison are
well-known in the art. Optimal alignment of sequences for comparison can be conducted,
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math.
2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad.
Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see,
e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in Molecular Biology).
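
For orientation, the local homology approach of Smith and Waterman can be sketched as a dynamic program that returns the best local alignment score. This is a simplified illustration, not the implementation in any of the packages named above: the function name, the scoring values (match, mismatch, gap), and the use of a linear rather than affine gap penalty are all assumptions.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between seq_a and seq_b.

    Score-only Smith-Waterman with a linear gap penalty. Only the previous
    row of the scoring matrix is kept, so no traceback is available.
    The scoring values are illustrative assumptions.
    """
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for a_char in seq_a:
        current = [0]
        for j, b_char in enumerate(seq_b, start=1):
            diagonal = previous[j - 1] + (match if a_char == b_char else mismatch)
            up = previous[j] + gap        # gap in seq_b
            left = current[j - 1] + gap   # gap in seq_a
            # Flooring at zero is what makes the alignment local.
            cell = max(0, diagonal, up, left)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


print(smith_waterman_score("TTTACGTTT", "GGGACGGGG"))  # shared "ACG": 6
```

Needleman and Wunsch's global method uses the same recurrence without the floor at zero and reads the score from the final cell, which is why it suits full-length comparisons while the local form suits finding a conserved region inside otherwise unrelated sequences.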

Preferred examples of algorithms that are suitable for determining percent sequence
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described
herein, to determine percent sequence identity for the nucleic acids and proteins of the
invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
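
The seed-and-extend procedure described above can be sketched for nucleotide sequences as follows. This is a deliberately simplified illustration and not the BLAST implementation: it seeds on exact word matches only (the neighborhood threshold T is omitted), performs ungapped extension, and uses hypothetical function names. The reward M=5 and penalty N=-4 follow the BLASTN defaults quoted in the text; the word size and X value in the usage lines are assumptions chosen to suit short example strings.

```python
def find_word_hits(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where a word of length W matches exactly."""
    index = {}
    for s_pos in range(len(subject) - word_size + 1):
        index.setdefault(subject[s_pos:s_pos + word_size], []).append(s_pos)
    hits = []
    for q_pos in range(len(query) - word_size + 1):
        for s_pos in index.get(query[q_pos:q_pos + word_size], []):
            hits.append((q_pos, s_pos))
    return hits


def extend(query, subject, q_pos, s_pos, step, start_score, reward, penalty, x_drop):
    """Extend in one direction (step is +1 or -1); return (best_score, best_length)."""
    best = current = start_score
    best_len = length = 0
    while 0 <= q_pos < len(query) and 0 <= s_pos < len(subject):
        current += reward if query[q_pos] == subject[s_pos] else penalty
        length += 1
        if current > best:
            best, best_len = current, length
        elif best - current >= x_drop or current <= 0:
            break  # fell off by X from the maximum, or dropped to zero or below
        q_pos += step
        s_pos += step
    return best, best_len


def extend_hit(query, subject, q_start, s_start, word_size,
               reward=5, penalty=-4, x_drop=20):
    """Extend a word hit both ways; return (score, query_start, query_end_exclusive)."""
    seed_score = reward * word_size
    score, right_len = extend(query, subject, q_start + word_size,
                              s_start + word_size, 1, seed_score,
                              reward, penalty, x_drop)
    score, left_len = extend(query, subject, q_start - 1, s_start - 1, -1,
                             score, reward, penalty, x_drop)
    return score, q_start - left_len, q_start + word_size + right_len


query = "GGGACGTACGTTT"
subject = "CCCACGTACGTAA"
print(find_word_hits(query, subject, 4)[0])               # first seed: (3, 3)
print(extend_hit(query, subject, 3, 3, 4, x_drop=10))     # (40, 3, 11)
```

In the usage lines the seed "ACGT" grows to the eight-base segment "ACGTACGT" (score 8 x 5 = 40); extension stops on the right at the end of the sequences and on the left once the score has fallen 12 below its maximum, which exceeds the assumed X of 10.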

The BLAST algorithm also performs a statistical analysis of the similarity between
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-
5877). One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001. Log values may be large
negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.

An indication that two nucleic acid sequences are substantially identical is that the 
polypeptide encoded by the first nucleic acid is immunologically cross reactive with the 
antibodies raised against the polypeptide encoded by the second nucleic acid. Thus, a 
polypeptide is typically substantially identical to a second polypeptide, e.g., where the two 
peptides differ only by conservative substitutions. Another indication that two nucleic acid
sequences are substantially identical is that the two molecules or their complements hybridize 
to each other under stringent conditions. Yet another indication that two nucleic acid 
sequences are substantially identical is that the same primers can be used to amplify the 
sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is 
substantially or essentially free from components that normally accompany it as found in its 
native state. Purity and homogeneity are typically determined using analytical chemistry 
techniques such as polyacrylamide gel electrophoresis or high performance liquid 
chromatography. A protein or nucleic acid that is the predominant species present in a 
preparation is substantially purified. In particular, an isolated nucleic acid is separated from 
some open reading frames that naturally flank the gene and encode proteins other than the 
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic 
acid or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means 
that the nucleic acid or protein is at least about 85% pure, more preferably at least 95% pure, 
and most preferably at least 99% pure. "Purify" or "purification" in other embodiments 
means removing at least one contaminant or component from the composition to be purified. 
In this sense, purification does not require that the purified compound be homogeneous, e.g., 
100% pure. 

The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to 
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which 
one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally 
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing 
modified residues, and non-naturally occurring amino acid polymers. 

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well 
as amino acid analogs and amino acid mimetics that function similarly to the naturally 
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic 
code, as well as those amino acids that are later modified, e.g., hydroxyproline, 
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have 
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is 
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, 
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have 
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic 
chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to 
chemical compounds that have a structure that is different from the general chemical 
structure of an amino acid, but that function similarly to another amino acid. 

Amino acids may be referred to herein by either their commonly known three letter 
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical 
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly 
accepted single-letter codes. 

"Conservatively modified variants" applies to both amino acid and nucleic acid 
sequences. With respect to particular nucleic acid sequences, conservatively modified 
variants refers to those nucleic acids which encode identical or essentially identical amino 
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to 
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the 
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode 
most proteins. For instance, the codons GCA, GCC, GCG, and GCU each encode the amino 
acid alanine. Thus, at each position where an alanine is specified by a codon, the codon can 
be altered to another of the corresponding codons described without altering the encoded 
polypeptide. Such nucleic acid variations are "silent variations," which are one species of 
conservatively modified variations. Every nucleic acid sequence herein which encodes a 
polypeptide also describes silent variations of the nucleic acid. In certain contexts each 
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and 
TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a 
functionally similar molecule. Accordingly, a silent variation of a nucleic acid which 
encodes a polypeptide is implicit in a described sequence with respect to the expression 
product, but not necessarily with respect to actual probe sequences. 
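
As a non-limiting illustration of a silent variation, the following Python sketch translates two RNA sequences that differ only in the alanine codon and confirms that they encode the same peptide. The table is deliberately partial and covers only the codons named above (with tryptophan written in its RNA form, UGG); the names `CODON_TABLE`, `translate`, and `is_silent_variant` are hypothetical.

```python
# Partial codon table (RNA alphabet); only the codons discussed in the text.
CODON_TABLE = {
    "GCA": "A", "GCC": "A", "GCG": "A", "GCU": "A",  # alanine (degenerate)
    "AUG": "M",  # methionine, ordinarily its only codon
    "UGG": "W",  # tryptophan (TGG in DNA notation), ordinarily its only codon
}


def translate(rna: str) -> str:
    """Translate an RNA string codon by codon using the partial table."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    # A KeyError here means the codon is outside this illustrative table.
    return "".join(CODON_TABLE[codon] for codon in codons)


def is_silent_variant(rna_a: str, rna_b: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return rna_a != rna_b and translate(rna_a) == translate(rna_b)
```

Here `is_silent_variant("AUGGCAUGG", "AUGGCCUGG")` holds because GCA and GCC both encode alanine, so both sequences translate to the peptide MAW.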

As to amino acid sequences, one of skill will recognize that individual substitutions, 
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which 
alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded 
sequence is a "conservatively modified variant" where the alteration results in the substitution 
of an amino acid with a chemically similar amino acid. Conservative substitution tables 
providing functionally similar amino acids are well known in the art. Such conservatively 
modified variants are in addition to and do not exclude polymorphic variants, interspecies 
homologs, and alleles of the invention. Typically, conservative substitutions for one 
another include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine 
(N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine 
(M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), 
Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)). 
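
The eight groups listed above can be encoded directly, as in the following minimal Python sketch. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and the check reflects only the groups recited here, not any particular published substitution matrix. Note that methionine (M) belongs to two groups.

```python
# One-letter codes for the eight conservative substitution groups in the text.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) alanine, glycine
    set("DE"),    # 2) aspartic acid, glutamic acid
    set("NQ"),    # 3) asparagine, glutamine
    set("RK"),    # 4) arginine, lysine
    set("ILMV"),  # 5) isoleucine, leucine, methionine, valine
    set("FYW"),   # 6) phenylalanine, tyrosine, tryptophan
    set("ST"),    # 7) serine, threonine
    set("CM"),    # 8) cysteine, methionine
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b differ and share at least one group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)
```

Under these groups, replacing leucine with methionine or cysteine with methionine counts as conservative, whereas replacing cysteine with leucine does not.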

Macromolecular structures such as polypeptide structures can be described in terms of 
various levels of organization. For a general discussion of this organization, see, e.g., 
Alberts, et al. (1994) Molecular Biology of the Cell (3d ed.) and Cantor and Schimmel (1980) 
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules. "Primary 
structure" refers to the amino acid sequence of a particular peptide. "Secondary structure" 
refers to locally ordered, three dimensional structures within a polypeptide. These structures 
are commonly known as domains. Domains are portions of a polypeptide that often form a 
compact unit of the polypeptide and are typically 25 to approximately 500 amino acids long. 
Typical domains are made up of sections of lesser organization such as stretches of β-sheet 
and α-helices. "Tertiary structure" refers to the complete three dimensional structure of a 
polypeptide monomer. "Quaternary structure" refers to the three dimensional structure 
formed, usually by the noncovalent association of independent tertiary units. Anisotropic 
terms are also known as energy terms. 

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents 
used herein means at least two nucleotides covalently linked together. Oligonucleotides are 
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up 
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any 
length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000, 
etc. A nucleic acid of the present invention will generally contain phosphodiester bonds, 
although in some cases, nucleic acid analogs are included that may have at least one different 
linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or 
O-methylphosphoroamidite linkages (see Eckstein (1992) Oligonucleotides and Analogues: A 
Practical Approach Oxford University Press); and peptide nucleic acid backbones and 
linkages. Other analog nucleic acids include those with positive backbones; non-ionic 
backbones, and non-ribose backbones, including those described in U.S. Patent Nos. 
5,235,033 and 5,034,506, and Chapters 6 and 7, in Sanghvi and Cook, eds. Carbohydrate 
Modifications in Antisense Research, ACS Symposium Series 580. Nucleic acids containing 
one or more carbocyclic sugars are also included within one definition of nucleic acids. 
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to 
increase the stability and half-life of such molecules in physiological environments or as 
probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; 
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring 
nucleic acids and analogs may be made. 

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic 
acid analogs. These backbones are substantially non-ionic under neutral conditions, in 
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids. 
This results in two advantages. First, the PNA backbone exhibits improved hybridization 
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus 
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an 
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C. 
Similarly, due to their non-ionic nature, hybridization of the bases attached to these 
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded 
by cellular enzymes, and thus can be more stable. 

The nucleic acids may be single stranded or double stranded, as specified, or contain 
portions of both double stranded or single stranded sequence. As will be appreciated by those 
in the art, the depiction of a single strand also defines the sequence of the complementary 
strand; thus the sequences described herein also provide the complement of the sequence. 
The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the 
nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations 
of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, 
hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally 
occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term 
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified 
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes 
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic 
acid, each containing a base, are referred to herein as a nucleoside. 
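
The statement that one depicted strand also defines its complement can be illustrated with a short Python sketch for the standard DNA bases. The names `DNA_COMPLEMENT` and `reverse_complement` are hypothetical, and the sketch assumes a sequence written 5' to 3' using only A, C, G, and T; modified bases such as inosine are outside its scope.

```python
# Translation table pairing each standard DNA base with its complement.
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(DNA_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which reflects that each strand fully determines the other.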

A "label" or a "detectable moiety" is a composition detectable by spectroscopic, 
photochemical, biochemical, immunochemical, physiological, chemical, or other physical 
means. For example, useful labels include fluorescent dyes, electron-dense reagents, 
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins 
or other entities which can be made detectable, e.g., by incorporating a radiolabel into the 
peptide or used to detect antibodies specifically reactive with the peptide. The labels may be 
incorporated into the cancer nucleic acids, proteins, and antibodies. Many methods known in 
the art for conjugating the antibody to the label may be employed, including those methods 
described by Hunter, et al. (1962) Nature 144:945; David, et al. (1974) Biochemistry 
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J. 
Histochem. and Cytochem. 30:407-412. 

An "effector" or "effector moiety" or "effector component" is a molecule that is 
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or 
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody. 
The "effector" can be a variety of molecules including, e.g., detection moieties including 
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope 
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a 
radioisotope emitting "hard," e.g., beta, radiation. 

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either 
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der 
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be 
detected by detecting the presence of the label bound to the probe. Alternatively, methods 
using high affinity interactions may achieve the same results where one of a pair of binding 
partners binds to the other, e.g., biotin, streptavidin. 

As used herein a "nucleic acid probe or oligonucleotide" is a nucleic acid capable of 
binding to a target nucleic acid of complementary sequence through one or more types of 
chemical bonds, usually through complementary base pairing, e.g., through hydrogen bond 
formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases 
(7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a 
linkage other than a phosphodiester bond, preferably one that does not functionally interfere 
with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the constituent 
bases are joined by peptide bonds rather than phosphodiester linkages. Probes may bind 
target sequences lacking complete complementarity with the probe sequence depending upon 
the stringency of the hybridization conditions. The probes are preferably directly labeled, 
e.g., with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled, e.g., with 
biotin to which a streptavidin complex may later bind. By assaying for the presence or 
absence of the probe, one can detect the presence or absence of the select sequence or 
subsequence. Diagnosis or prognosis may be based at the genomic level, or at the level of 
RNA or protein expression. 

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid, 
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by 
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic 
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant 
cells express genes that are not found within the native (non-recombinant) form of the cell or 
express native genes that are otherwise abnormally expressed, under expressed or not 
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid, 
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using 
polymerases and endonucleases, in a form not normally found in nature. In this manner, 
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear 
form, or an expression vector formed in vitro by ligating DNA molecules that are not 
normally joined, are both considered recombinant for the purposes of this invention. It is 
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or 
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the 
host cell rather than in vitro manipulations; however, such nucleic acids, once produced 
recombinantly, although subsequently replicated non-recombinantly, are still considered 
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein 
made using recombinant techniques, i.e., through the expression of a recombinant nucleic 
acid as depicted above. 

The term "heterologous" when used with reference to portions of a nucleic acid 
indicates that the nucleic acid comprises two or more subsequences that are not normally 
found in the same relationship to each other in nature. For instance, the nucleic acid is 
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes 
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a 
coding region from another source. Similarly, a heterologous protein will often refer to two 
or more subsequences that are not found in the same relationship to each other in nature (e.g., 
a fusion protein). 

A "promoter" is typically an array of nucleic acid control sequences that direct 
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid 
sequences near the start site of transcription, such as, in the case of a polymerase II type 
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor 
elements, which can be located as much as several thousand base pairs from the start site of 
transcription. A "constitutive" promoter is a promoter that is active under most 
environmental and developmental conditions. An "inducible" promoter is a promoter that is 
active under environmental or developmental regulation. The term "operably linked" refers 
to a functional linkage between a nucleic acid expression control sequence (such as a 
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, 
e.g., wherein the expression control sequence directs transcription of the nucleic acid 
corresponding to the second sequence. 

An "expression vector" is a nucleic acid construct, generated recombinantly or 
synthetically, with a series of specified nucleic acid elements that permit transcription of a 
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or 
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be 
transcribed in operable linkage to a promoter. 

The phrase "selectively (or specifically) hybridizes to" refers to the binding, 
duplexing, or hybridizing of a molecule selectively to a particular nucleotide sequence under 
stringent hybridization conditions when that sequence is present in a complex mixture (e.g., 
total cellular or library DNA or RNA). 

The phrase "stringent hybridization conditions" refers to conditions under which a 
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic 
acids, but to essentially no other sequences. Stringent conditions are sequence-dependent and 
will be different in different circumstances. Longer sequences hybridize specifically at 
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in 
"Overview of principles of hybridization and the strategy of nucleic acid assays" in Tijssen 
(1993) Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic 
Probes (vol. 24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C 
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength 
and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid 
concentration) at which 50% of the probes complementary to the target hybridize to the target 
sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the 
probes are occupied at equilibrium). Stringent conditions will be those in which the salt 
concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion 
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for 
short probes (e.g., 10 to 50 nucleotides) and at least about 60° C for long probes (e.g., greater 
than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing 
agents such as formamide. For selective or specific hybridization, a positive signal is 
typically at least two times background, preferably 10 times background hybridization. 
Exemplary stringent hybridization conditions are often: 50% formamide, 5x SSC, and 1% 
SDS, incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC, 
and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency 
amplification, although annealing temperatures may vary between about 32° C and 48° C 
depending on primer length. For high stringency PCR amplification, a temperature of about 
62° C is typical, although high stringency annealing temperatures can range from about 50° C 
to about 65° C, depending on the primer length and specificity. Typical cycle conditions for 
both high and low stringency amplifications include a denaturation phase of 90° C - 95° C for 
0.5 - 2 min., an annealing phase lasting 0.5 - 2 min., and an extension phase of about 72° C 
for 1 - 2 min. Protocols and guidelines for low and high stringency amplification reactions 
are provided, e.g., in Innis, et al. (1990) PCR Protocols. A Guide to Methods and 
Applications. 
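
Two of the numerical rules above lend themselves to a brief illustration: selecting a temperature about 5-10° C below the Tm, and treating a signal of at least two times background as positive. The following Python sketch assumes the Tm has been determined separately for the sequence and conditions of interest; the function names and the default `fold` value are hypothetical conveniences, not part of any described protocol.

```python
from typing import Tuple


def stringent_temperature_window(tm_c: float) -> Tuple[float, float]:
    """Return (low, high) in degrees C, i.e. about 10 to 5 degrees below Tm."""
    return (tm_c - 10.0, tm_c - 5.0)


def is_positive_signal(signal: float, background: float, fold: float = 2.0) -> bool:
    """True if the signal is at least `fold` times the background signal.

    fold=2.0 reflects "at least two times background"; pass fold=10.0 for
    the preferred 10 times background criterion.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    return signal >= fold * background
```

For a probe with a Tm of 65° C, the window is 55° C to 60° C; a signal of 250 over a background of 100 is positive at the two-fold criterion but not at the ten-fold criterion.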

Nucleic acids that do not hybridize to each other under stringent conditions are still 
substantially identical if the polypeptides which they encode are substantially identical. This 
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy 
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under 
moderately stringent hybridization conditions. Exemplary "moderately stringent 
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl, 
1% SDS at 37° C, and a wash in 1X SSC at 45° C. A positive hybridization is at least twice 
background. Alternative hybridization and wash conditions can be utilized to provide 
conditions of similar stringency. Additional guidelines for determining hybridization 
parameters are provided in numerous references, e.g., Ausubel, et al. (ed.) Current Protocols in 
Molecular Biology Lippincott. 

The phrase "functional effects" in the context of assays for testing compounds that 
modulate activity of a lung cancer protein includes the determination of a parameter that is 
indirectly or directly under the influence of the lung cancer protein or nucleic acid, e.g., a 
physiological, enzymatic, functional, physical, or chemical effect, such as the ability to 
decrease lung cancer. It includes ligand binding activity; cell viability, cell growth on soft 
agar; anchorage dependence; contact inhibition and density limitation of growth; cellular 
proliferation; cellular transformation; growth factor or serum dependence; tumor specific 
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and 
protein expression in cells undergoing metastasis, and other characteristics of lung cancer 
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities. 

By "determining the functional effect" is meant assaying for a compound that 
increases or decreases a parameter that is indirectly or directly under the influence of a lung 
cancer protein sequence, e.g., physiological, functional, enzymatic, physical, or chemical 
effects. Such functional effects can be measured by many means known to those skilled in 
the art, e.g., changes in spectroscopic characteristics (e.g., fluorescence, absorbance, 
refractive index), hydrodynamic (e.g., shape), chromatographic, or solubility properties for 
the protein, measuring inducible markers or transcriptional activation of the lung cancer 
protein; measuring binding activity or binding assays, e.g., binding to antibodies or other 
ligands, and measuring cellular proliferation. Determination of the functional effect of a 
compound on lung cancer can also be performed using lung cancer assays known to those of 
skill in the art such as in vitro assays, e.g., cell growth on soft agar; anchorage 
dependence; contact inhibition and density limitation of growth; cellular proliferation; 
cellular transformation; growth factor or serum dependence; tumor specific marker levels; 
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein 
expression in cells undergoing metastasis, and other characteristics of lung cancer cells. The 
functional effects can be evaluated by many means known to those skilled in the art, e.g., 
microscopy for quantitative or qualitative measures of alterations in morphological features, 
measurement of changes in RNA or protein levels for lung cancer-associated sequences, 
measurement of RNA stability, identification of downstream or reporter gene expression 
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence, 
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays. 

"Inhibitors", "activators", and "modulators" of lung cancer polynucleotide and 
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or 
compounds identified using in vitro and in vivo assays of lung cancer polynucleotide and 
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block 
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the 
activity or expression of lung cancer proteins, e.g., antagonists. Antisense or inhibitory 
nucleic acids may seem to inhibit expression and subsequent function of the protein. 
"Activators" are compounds that increase, open, activate, facilitate, enhance activation, 
sensitize, agonize, or up regulate lung cancer protein activity. Inhibitors, activators, or 
modulators also include genetically modified versions of lung cancer proteins, e.g., versions 
with altered activity, as well as naturally occurring and synthetic ligands, antagonists, 
agonists, antibodies, small chemical molecules and the like. Such assays for inhibitors and 
activators include, e.g., expressing the lung cancer protein in vitro, in cells, or cell 
membranes, applying putative modulator compounds, and then determining the functional 
effects on activity, as described above. Activators and inhibitors of lung cancer can also be 
identified by incubating lung cancer cells with the test compound and determining increases 
or decreases in the expression of 1 or more lung cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 
25, 30, 40, 50 or more lung cancer proteins, such as lung cancer proteins encoded by the 
sequences set out in Tables 1A-16. 

Samples or assays comprising lung cancer proteins that are treated with a potential 
activator, inhibitor, or modulator are compared to control samples without the inhibitor, 
activator, or modulator to examine the extent of inhibition. Control samples (untreated with 
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide 
is achieved when the activity value relative to the control is about 80%, preferably 50%, more 
preferably 25-0%. Activation of a lung cancer polypeptide is achieved when the activity 
value relative to the control (untreated with activators) is 110%, more preferably 150%, more 
preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably 
1000-3000% higher. 
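
The comparison to a control assigned a value of 100% can be sketched as follows. This Python example is illustrative only: the function names are hypothetical, the cutoffs of 80% and 110% are read here as "at or below" and "at or above" respectively (an assumption about how the recited values would be applied), and the activity measurements are assumed to be in the same arbitrary units for treated and control samples.

```python
def relative_activity(treated: float, control: float) -> float:
    """Activity of a treated sample as a percentage of the untreated control."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated / control


def classify_modulation(percent: float) -> str:
    """Classify a relative activity value against the 80% and 110% cutoffs."""
    if percent <= 80.0:
        return "inhibition"
    if percent >= 110.0:
        return "activation"
    return "no clear effect"
```

For example, a treated sample measuring 40 units against a control of 80 units has a relative activity of 50% and is classified as inhibited.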

The phrase "changes in cell growth" refers to any change in cell growth and 
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci, 
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and 
density limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers, 
ability to form or suppress tumors when injected into suitable animal hosts, and/or 
immortalization of the cell. See, e.g., Freshney (1994) Culture of Animal Cells a Manual of 
Basic Technique pp. 231-241 (3d ed.). 

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor. 
"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to 
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new 
genetic material. Although transformation can arise from infection with a transforming virus 
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is associated with phenotypic changes, such as immortalization of cells, 
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney 
(1994) Culture of Animal Cells a Manual of Basic Technique (3d ed.)). 

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, 
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region 
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as 
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or 
its functional equivalent will be most critical in specificity and affinity of binding. See Paul, 
Fundamental Immunology. 

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each 
tetramer is composed of two identical pairs of polypeptide chains, each pair having one 
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each 
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible 
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) 
refer to these light and heavy chains respectively. 

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized 
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an 
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab 
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be 
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby 
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab 
with part of the hinge region (see Paul (ed. 1999) Fundamental Immunology (4th ed.)). While 
various antibody fragments are defined in terms of the digestion of an intact antibody, one of 
skill will appreciate that such fragments may be synthesized de novo either chemically or by 
using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes 
antibody fragments either produced by the modification of whole antibodies, or those 
synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those 
identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature 348:552-554). 

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal 
antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein 
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; Cole, et al. 
(1985), pp. 77-96 in Monoclonal Antibodies and Cancer Therapy; Coligan (1991 and 
supplements) Current Protocols in Immunology; Harlow and Lane (1988) Antibodies, A 
Laboratory Manual; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d 
ed.)). Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can 
be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or 
other organisms such as other mammals, may be used to express humanized antibodies. 
Alternatively, phage display technology can be used to identify antibodies and heteromeric 
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al. (1990) 
Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779-783). 

A "chimeric antibody" is an antibody molecule in which, e.g., (a) the constant region,
or a portion thereof, is altered, replaced, or exchanged so that the antigen binding site
(variable region) is linked to a constant region of a different or altered class, effector
function, and/or species, or an entirely different molecule which confers new properties to the
chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the
variable region, or a portion thereof, is altered, replaced, or exchanged with a variable region
having a different or altered antigen specificity.

Identification of lung cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different patient
samples for which diagnosis information is desired, to provide expression profiles. An
expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
characteristic of the state of the cell. That is, normal tissue may be distinguished from
cancerous or metastatic cancerous tissue, or metastatic cancerous tissue can be compared
with tissue from surviving cancer patients. By comparing expression profiles of tissue in
known different lung cancer states, information regarding which genes are important
(including both up- and down-regulation of genes) in each of these states is obtained.
Molecular profiling may distinguish subtypes of a currently collective disease designation,
e.g., different forms of lung cancer (chronic disease, adenocarcinoma, etc.).

The identification of sequences that are differentially expressed in lung cancer versus
non-lung cancer tissue allows the use of this information in a number of ways. For example,
a particular treatment regime may be evaluated: does a chemotherapeutic drug act to
down-regulate lung cancer, and thus tumor growth or recurrence, in a particular patient?
Alternatively, a treatment step may induce other markers which may be used as targets to
destroy tumor cells. Similarly, diagnosis and treatment outcomes may be done or confirmed
by comparing patient samples with the known expression profiles. Malignant disease may be
compared to non-malignant conditions. Metastatic tissue can also be analyzed to determine
the stage of lung cancer in the tissue, or origin of primary tumor, e.g., metastasis from a
remote primary site. Furthermore, these gene expression profiles (or individual genes) allow
screening of drug candidates with an eye to mimicking or altering a particular expression
profile; e.g., screening can be done for drugs that suppress the lung cancer expression profile.
This may be done by making biochips comprising sets of the important lung cancer genes,
which can then be used in these screens. PCR methods may be applied with selected primer
pairs, and analysis may be of RNA or of genomic sequences. These methods can also be
done on the protein basis; that is, protein expression levels of the lung cancer proteins can be
evaluated for diagnostic purposes or to screen candidate agents. In addition, the lung cancer
nucleic acid sequences can be administered for gene therapy purposes, including the
administration of antisense nucleic acids, or the lung cancer proteins (including antibodies
and other modulators thereof) administered as therapeutic drugs or as protein or DNA
vaccines.

Thus the present invention provides nucleic acid and protein sequences that are
differentially expressed in lung cancer relative to normal tissues and/or non-malignant lung
disease, or in different types of lung disease, herein termed "lung cancer sequences." As
outlined below, lung cancer sequences include those that are up-regulated (i.e., expressed at a
higher level) in lung cancer, as well as those that are down-regulated (i.e., expressed at a
lower level). In a preferred embodiment, the lung cancer sequences are from humans;
however, as will be appreciated by those in the art, lung cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other lung
cancer sequences are provided, from vertebrates, including mammals, including rodents (rats,
mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows,
horses, etc.) and pets (dogs, cats, etc.). Lung cancer sequences from other organisms may be
obtained using the techniques outlined below.

Lung cancer sequences can include both nucleic acid and amino acid sequences. As
will be appreciated by those in the art and is more fully outlined below, lung cancer nucleic
acid sequences are useful in a variety of applications, including diagnostic applications,
which will detect naturally occurring nucleic acids, as well as screening applications; e.g.,
biochips comprising nucleic acid probes or PCR microtiter plates with selected probes to the
lung cancer sequences can be generated.

A lung cancer sequence can be initially identified by substantial nucleic acid and/or
amino acid sequence homology to the lung cancer sequences outlined herein. Such
homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, e.g., using homology programs or hybridization
conditions.

For identifying lung cancer-associated sequences, the lung cancer screen typically
includes comparing genes identified in different tissues, e.g., normal and cancerous tissues,
cancer and non-malignant conditions, non-malignant conditions and normal tissues, or tumor
tissue samples from patients who have metastatic disease vs. non-metastatic tissue. Other
suitable tissue comparisons include comparing lung cancer samples with metastatic cancer
samples from other cancers, such as breast, other gastrointestinal cancers, prostate, ovarian,
etc. Samples of non-metastatic disease tissue and tissue undergoing metastasis are applied to
biochips comprising nucleic acid probes. The samples are first microdissected, if applicable,
and treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix, Santa Clara, CA. Gene expression profiles as
described herein are generated and the data analyzed.

In one embodiment, the genes showing changes in expression as between normal and
disease states are compared to genes expressed in other normal tissues, preferably normal
lung, but also including, and not limited to, colon, heart, brain, liver, breast, kidney, muscle,
prostate, small intestine, large intestine, spleen, bone, and/or placenta. In a preferred
embodiment, those genes identified during the lung cancer screen that are expressed in
significant amounts in other tissues (e.g., essential organs) are removed from the profile,
although in some embodiments, this is not necessary (e.g., where organs may be dispensable
at a later stage of life). That is, when screening for drugs, it is usually preferable that the
target expression be disease specific, to minimize possible side effects on other organs.
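
By way of illustration only, the removal of candidates that are expressed in significant amounts
in essential organs can be sketched as a simple filter. The function name, the data layout, and
the numeric cutoff below are hypothetical and are not taken from this specification; what counts
as a "significant amount" would depend on the assay used.

```python
def disease_specific(candidates, normal_tissue_expr, essential_tissues, cutoff):
    """Keep candidate genes whose expression in every listed essential
    tissue is below `cutoff` (a hypothetical, assay-dependent threshold).

    normal_tissue_expr maps gene -> {tissue: expression level}; a tissue
    with no measurement is treated here as not expressed (an assumption).
    """
    kept = []
    for gene in candidates:
        levels = normal_tissue_expr.get(gene, {})
        if all(levels.get(tissue, 0.0) < cutoff for tissue in essential_tissues):
            kept.append(gene)
    return kept


# Hypothetical example data: gene names and levels are placeholders.
expr = {
    "G1": {"heart": 0.1, "brain": 0.2},
    "G2": {"heart": 5.0, "brain": 0.1},
    "G3": {},
}
print(disease_specific(["G1", "G2", "G3"], expr, ["heart", "brain"], 1.0))
```

In this sketch G2 is dropped because of its heart expression, while G1 and G3 are retained.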

In a preferred embodiment, lung cancer sequences are those that are up-regulated in
lung cancer; that is, the expression of these genes is higher in cancerous tissue than in normal
lung or other tissue. "Up-regulation" as used herein means, when the ratio is presented as a
number greater than one, that the ratio is greater than one, preferably 1.5 or greater, more
preferably 2.0 or greater. Another embodiment is directed to sequences up-regulated in
non-malignant conditions relative to normal. Unigene cluster identification numbers and
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, DA, et al. (1998) Nucleic Acids Research 26:1-7 and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ).
Another embodiment is directed to sequences up-regulated in non-malignant conditions
relative to normal. In some situations, the sequences may be derived from assembly of
available sequences or be predicted from genomic DNA using exon prediction algorithms,
such as FGENESH (Salamov and Solovyev (2000) Genome Res. 10:516-522). In other
situations, sequences have been derived from cloning and sequencing of isolated nucleic
acids.

In another preferred embodiment, lung cancer sequences are those that are
down-regulated in the lung cancer; that is, the expression of these genes is lower in cancerous
tissue than in normal lung or other tissue. "Down-regulation" as used herein means, when the
ratio is presented as a number greater than one, that the ratio is greater than one, preferably
1.5 or greater, more preferably 2.0 or greater, or, when the ratio is presented as a number less
than one, that the ratio is less than one, preferably 0.5 or less, more preferably 0.25 or less.
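
The ratio definitions of up- and down-regulation given above can be illustrated with a short
sketch. It assumes the ratio is expressed as cancerous over normal expression and uses the
"preferably" values of 1.5 and 0.5 as defaults; the function and parameter names are
hypothetical, and the stricter values (2.0 and 0.25) can be passed instead.

```python
def classify_regulation(cancer_level, normal_level,
                        up_threshold=1.5, down_threshold=0.5):
    """Classify a gene from its cancer/normal expression ratio.

    Defaults follow the 'preferably 1.5 or greater' and 'preferably 0.5
    or less' values; both inputs must be positive for a ratio to exist.
    """
    if cancer_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = cancer_level / normal_level
    if ratio >= up_threshold:
        return "up-regulated"
    if ratio <= down_threshold:
        return "down-regulated"
    return "unchanged"


print(classify_regulation(30.0, 10.0))  # ratio 3.0
print(classify_regulation(5.0, 10.0))   # ratio 0.5
print(classify_regulation(12.0, 10.0))  # ratio 1.2
```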

Informatics 

The ability to identify genes that are over or under expressed in lung cancer can
additionally provide high-resolution, high-sensitivity datasets which can be used in the areas
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure,
biosensor development, and other related areas. For example, the expression profiles can be
used in diagnostic or prognostic evaluation of patients with lung cancer. Or as another
example, subcellular toxicological information can be generated to better direct drug structure
and activity correlation (see Anderson (1998) Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological
sensor device to predict the likely toxicological effect of chemical exposures and likely
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that includes
at least one set of assay data. The data contained in the database is acquired, e.g., using array
analysis either singly or in a library format. The database can be in a form in which data can
be maintained and transmitted, but is preferably an electronic database. The electronic
database of the invention can be maintained on any electronic device allowing for the storage
of and access to the database, such as a personal computer, but is preferably distributed on a
wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence data is for 
clarity of illustration only. It will be apparent to those of skill in the art that similar databases 
can be assembled for assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative and/or
absolute abundance of a variety of molecular and macromolecular species from a biological
sample representing lung cancer, i.e., the identification of lung cancer-associated sequences
described herein, provide an abundance of information, which can be correlated with
pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological status,
among others. Although the data generated from the assays of the invention is suited for
manual review and analysis, in a preferred embodiment, data processing using high-speed
computers is utilized.

An array of methods for indexing and retrieving biomolecular information is known
in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database
system for storing biomolecular sequence information in a manner that allows sequences to
be catalogued and searched according to one or more protein function hierarchies. U.S.
Patent 5,953,727 discloses a relational database having sequence records containing
information in a format that allows a collection of partial-length DNA sequences to be
catalogued and searched according to association with one or more sequencing projects for
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount, et al. (2001) Bioinformatics; Durbin, et al. (eds., 1999) Biological
Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids; Baxevanis and
Ouellette (eds., 1998) Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins; Rashidi and Buehler (1999) Bioinformatics: Basic Applications in Biological
Science and Medicine; Setubal, et al. (eds., 1997) Introduction to Computational Molecular
Biology; Misener and Krawetz (eds., 2000) Bioinformatics: Methods and Protocols; Higgins
and Taylor (eds., 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical
Approach; Brown (2001) Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet; Han and Kamber (2000) Data Mining: Concepts and Techniques; and
Waterman (1995) Introduction to Computational Biology: Maps, Sequences, and Genomes.

The present invention provides a computer database comprising a computer and
software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing sample
is from a control tissue sample known to be free of pathological disorders. In a variation, at
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or
another tissue specimen to be analyzed for lung cancer. In another variation, the assay
records cross-tabulate one or more of the following parameters for each target species in a
sample: (1) a unique identification code, which can include, e.g., a target molecular structure
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample
source; and (3) absolute and/or relative quantity of the target species present in the sample.
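
As a non-limiting sketch, an assay record carrying the three parameters listed above, and a
cross-tabulation of such records by target and sample source, might be represented as follows.
The class name, field names, and example values are hypothetical and serve only to illustrate
the record layout.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str       # (1) unique identification code
    sample_source: str   # (2) sample source
    quantity: float      # (3) absolute or relative quantity


def cross_tabulate(records):
    """Return {target_id: {sample_source: quantity}} for a list of records.

    If the same target/source pair occurs twice, the later record wins
    (an arbitrary choice made for this illustration).
    """
    table = defaultdict(dict)
    for rec in records:
        table[rec.target_id][rec.sample_source] = rec.quantity
    return {target: dict(by_source) for target, by_source in table.items()}


# Hypothetical example records.
records = [
    AssayRecord("T001", "control", 1.0),
    AssayRecord("T001", "neoplastic lesion", 3.2),
    AssayRecord("T002", "control", 0.8),
]
print(cross_tabulate(records))
```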

The invention also provides for the storage and retrieval of a collection of target data
in a computer data storage apparatus, which can include magnetic disks, optical disks,
magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic
bubble memory devices, and other data storage devices, including CPU registers and on-CPU
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor
and a charge storage area, which may be on the transistor). In one embodiment, the invention
provides such storage devices, and computer systems built therewith, comprising a bit pattern
encoding a protein expression fingerprint record comprising unique identifiers for at least 10
target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides a
method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux,
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed,
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention
in a file format suitable for retrieval and processing in a computerized sequence analysis,
comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing devices
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic
domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells)
comprising a bit pattern encoding data acquired from an assay of the invention.

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a
database comprising a plurality of assay results obtained by the method of the invention.

In a preferred embodiment, the invention provides a computer system for comparing a
query target to a database containing an array of data structures, such as an assay result
obtained by the method of the invention, and ranking database targets based on the degree of
identity and gap weight to the target data. A central processor is preferably initialized to load
and execute the computer program for alignment and/or comparison of the assay results.
Data for a query target is entered into the central processor via an I/O device. Execution of
the computer program results in the central processor retrieving the assay data from the data
file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to secondary
memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or
SDRAM). Targets are ranked according to the degree of correspondence between a selected
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of
the query target and results are output via an I/O device. For example, a central processor
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC,
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM,
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O
device.

The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.
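
The rank-ordering of comparison results on the basis of computed similarity values can be
sketched as below. This sketch does not use FASTA, GAP, or BESTFIT; as a stand-in it relies on
the similarity ratio of Python's standard-library difflib.SequenceMatcher, which is not a
biological alignment score. The function name and the example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def rank_targets(query, database):
    """Rank database entries by similarity to `query`, best first.

    `database` maps an entry name to its sequence string. The score is
    SequenceMatcher.ratio(), used here only as a stand-in for a real
    alignment score; ties are broken alphabetically by name.
    """
    scored = [
        (SequenceMatcher(None, query, seq, autojunk=False).ratio(), name)
        for name, seq in database.items()
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored]


# Hypothetical stored records and query.
db = {"exact": "MKTAYIAK", "near": "MKTAYIAR", "far": "GGGGGGGG"}
for name, score in rank_targets("MKTAYIAK", db):
    print(f"{name}\t{score:.3f}")
```

A production system would substitute a proper alignment score (with gap weights) for the
stand-in ratio while keeping the same sort-and-report structure.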

Characteristics of lung cancer-associated proteins

Lung cancer proteins of the present invention may be classified as secreted proteins,
transmembrane proteins or intracellular proteins. In one embodiment, the lung cancer protein
is an intracellular protein. Intracellular proteins may be found in the cytoplasm and/or in the
nucleus. Intracellular proteins are involved in all aspects of cellular function and replication
(including, e.g., signaling pathways); aberrant expression of such proteins often results in
unregulated or disregulated cellular processes (see, e.g., Alberts (ed. 1994) Molecular
Biology of the Cell (3d ed.)). For example, many intracellular proteins have enzymatic
activity such as protein kinase activity, protein phosphatase activity, protease activity,
nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins also serve
as docking proteins that are involved in organizing complexes of proteins, or targeting
proteins to various subcellular localizations, and are involved in maintaining the structural
integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence in the
proteins of one or more structural motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.
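
As a minimal illustration of identifying a motif on the basis of amino acid sequence, the
sketch below scans a sequence for the short proline-rich pattern P-x-x-P, which is commonly
described as a core recognition pattern for SH3 domains. The choice of this pattern, the
function name, and the example sequence are assumptions made for illustration; practical motif
detection would normally rely on curated profiles such as those described in the next paragraph.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
PXXP = re.compile(r"(?=(P..P))")


def find_pxxp(sequence):
    """Return (0-based start, matched text) for every P-x-x-P occurrence."""
    return [(m.start(), m.group(1)) for m in PXXP.finditer(sequence.upper())]


print(find_pxxp("AAPLSPAA"))
print(find_pxxp("PPPPP"))
```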

One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res. 26:320-
322).

In another embodiment, the lung cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains.
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases
and receptor serine/threonine protein kinases contain a single transmembrane domain.
However, various other proteins including channels, pumps, and adenylyl cyclases contain
numerous transmembrane domains. Many important cell surface receptors such as G protein
coupled receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they
contain 7 membrane spanning regions. Characteristics of transmembrane domains include
approximately 17 consecutive hydrophobic amino acids that may be followed by charged
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the
localization and number of transmembrane domains within the protein may be predicted (see,
e.g., PSORT web site http://psort.nibb.ac.jp/).
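
The characteristic noted above, a run of approximately 17 consecutive hydrophobic amino acids,
can be illustrated with a crude scan for such runs. This is not the PSORT method. The set of
residues treated as hydrophobic, the function name, and the example sequence are assumptions
made for this sketch, and real predictors use hydropathy scales and windowed averaging rather
than strict runs.

```python
# Residues treated as hydrophobic in this sketch (an assumption).
HYDROPHOBIC = set("ACFGILMVW")


def hydrophobic_stretches(sequence, min_len=17):
    """Return (start, end) half-open, 0-based spans of consecutive
    hydrophobic residues that are at least `min_len` long."""
    spans = []
    start = None
    seq = sequence.upper()
    for i, residue in enumerate(seq):
        if residue in HYDROPHOBIC:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                spans.append((start, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(seq) - start >= min_len:
        spans.append((start, len(seq)))
    return spans


example = "MKR" + "L" * 20 + "KRDEAVIK"  # hypothetical sequence
spans = hydrophobic_stretches(example)
print(spans, "candidate transmembrane segment(s):", len(spans))
```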

The extracellular domains of transmembrane proteins are diverse; however, conserved
motifs are found repeatedly among various extracellular domains. Conserved structure
and/or functions have been ascribed to different extracellular motifs. Many extracellular
domains are involved in binding to other molecules. In one aspect, extracellular domains are
found on receptors. Factors that bind the receptor domain include circulating ligands, which
may be peptides, proteins, or small molecules such as adenosine and the like. For example,
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines,
mitogenic factors, hormones, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they may mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains may
also associate with the extracellular matrix and contribute to the maintenance of the cell
structure.

Lung cancer proteins that are transmembrane are particularly preferred in the present
invention as they are readily accessible targets for extracellular immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can be also useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ or in histological analysis. Alternatively, antibodies can also label intracellular proteins,
in which case analytical samples are typically permeabilized to provide access to intracellular
proteins. In addition, some membrane proteins can be processed to release a soluble protein,
or to expose a residual fragment. Released soluble proteins may be useful diagnostic
markers; processed residual protein fragments may be useful lung markers of disease.

It will also be appreciated by those in the art that a transmembrane protein can be 
made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the lung cancer proteins are secreted proteins, the secretion of
which can be either constitutive or regulated. These proteins may have a signal peptide or
signal sequence that targets the molecule to the secretory pathway. Secreted proteins are
involved in numerous physiological events; e.g., if circulating, they often serve to transmit
signals to various other cell types. The secreted protein may function in an autocrine manner
(acting on the cell that secreted the factor), a paracrine manner (acting on cells in close
proximity to the cell that secreted the factor), an endocrine manner (acting on cells at a
distance, e.g., secretion into the blood stream), or exocrine (secretion, e.g., through a duct or
to adjacent epithelial surface, as in sweat glands, sebaceous glands, pancreatic ducts, lacrimal
glands, mammary glands, wax producing glands of the ear, etc.). Thus secreted molecules
often find use in modulating or altering numerous aspects of physiology. Lung cancer
proteins that are secreted proteins are particularly preferred in the present invention as they
serve as good targets for diagnostic markers, e.g., for blood, plasma, serum, or stool tests.
Those which are enzymes may be antibody or small molecule targets. Others may be useful
as vaccine targets, e.g., via CTL mechanisms.

Use of lung cancer nucleic acids 

As described above, a lung cancer sequence is initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the lung cancer sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence,
and is generally determined as outlined below, using either homology programs or
hybridization conditions. Typically, linked sequences on a mRNA are found on the same
molecule.

The lung cancer nucleic acid sequences of the invention, e.g., the sequences in Tables
1A-16, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes" in this
context includes coding regions, non-coding regions, and mixtures of coding and non-coding
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided
herein, extended sequences, in either direction, of the lung cancer genes can be obtained,
using techniques well known in the art for cloning either longer sequences or the full length
sequences; see Ausubel, et al., supra. Much can be done by informatics and many sequences
can be clustered to include multiple sequences corresponding to a single gene, e.g., systems
such as UniGene (see http://www.ncbi.nlm.nih.gov/UniGene/).

Once a lung cancer nucleic acid is identified, it can be cloned and, if necessary, its
constituent parts recombined to form the entire lung cancer nucleic acid coding regions or the
entire mRNA sequence. Once isolated from its natural source, e.g., contained within a
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the
recombinant lung cancer nucleic acid can be further used as a probe to identify and isolate
other lung cancer nucleic acids, e.g., extended coding regions. It can also be used as a
"precursor" nucleic acid to make modified or variant lung cancer nucleic acids and proteins.

The lung cancer nucleic acids of the present invention are used in several ways. In a
first embodiment, nucleic acid probes to the lung cancer nucleic acids are made and attached
to biochips to be used in screening and diagnostic methods, as outlined below, or for
administration, e.g., for gene therapy, RNAi, vaccine, and/or antisense applications.
Alternatively, the lung cancer nucleic acids that include coding regions of lung cancer
proteins can be put into expression vectors for the expression of lung cancer proteins, again
for screening purposes or for administration to a patient.

In a preferred embodiment, nucleic acid probes to lung cancer nucleic acids (both the
nucleic acid sequences outlined in the figures and/or the complements thereof) are made.
The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the lung cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under appropriate reaction conditions, particularly high stringency
conditions, as outlined herein.
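
As a rough illustration of the complementarity requirement described above, the following
Python sketch counts base pair mismatches between a probe and a target site. The function
names and the max_mismatches cutoff are hypothetical and are not part of the disclosure;
whether a probe is in fact substantially complementary depends on hybridization under the
chosen stringency conditions, which a fixed mismatch count only approximates.

```python
# Illustrative sketch only: names and the mismatch cutoff are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(probe: str, target_site: str) -> int:
    """Count positions at which the probe fails to pair with the target site.

    Both sequences are written 5'->3' and must have the same length.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must be the same length")
    perfect_probe = reverse_complement(target_site)
    return sum(a != b for a, b in zip(probe.upper(), perfect_probe))


def is_substantially_complementary(probe: str, target_site: str,
                                   max_mismatches: int = 2) -> bool:
    # max_mismatches is an assumed stand-in for "hybridizes under the
    # chosen reaction conditions"; it is not a value taken from the text.
    return count_mismatches(probe, target_site) <= max_mismatches
```

For example, a probe GGCCAT is perfectly complementary to the target site ATGGCC, while
GGCCAA carries a single mismatch against the same site.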

A nucleic acid probe is generally single stranded but can be partially single and
partially double stranded. The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally
complements of ORFs or whole genes are not used. In some embodiments, nucleic acids of
lengths up to hundreds of bases can be used.

In a preferred embodiment, more than one probe per sequence is used, with either
overlapping probes or probes to different sections of the target being used. That is, two,
three, four or more probes, with three being preferred, are used to build in a redundancy for a
particular target. The probes can be overlapping (i.e., have some sequence in common), or
separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
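
The use of several overlapping or separate probes per target can be sketched as follows. This
is a minimal Python illustration under stated assumptions: the function name, the default
probe length of 40 bases, and the step parameter are hypothetical choices made for the
example, with only the 8 to 100 base range and the preference for three probes taken from the
text.

```python
# Illustrative sketch only: the function name and defaults are assumptions.
def probe_windows(target_len: int, probe_len: int = 40,
                  n_probes: int = 3, step: int = 20) -> list:
    """Return (start, end) windows on a target for up to n_probes probes.

    A step smaller than probe_len yields overlapping probes; a step equal
    to or larger than probe_len yields separate probes.
    """
    if not 8 <= probe_len <= 100:
        raise ValueError("probe length outside the general 8-100 base range")
    if step <= 0:
        raise ValueError("step must be positive")
    windows = []
    for i in range(n_probes):
        start = i * step
        end = start + probe_len
        if end > target_len:
            break  # stop once a probe would run past the end of the target
        windows.append((start, end))
    return windows
```

With a 200 base target, a step of 20 gives three overlapping 40-base windows, and a step of
60 gives three separate windows.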

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is typically meant one or more of
electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is
the covalent attachment of a molecule, such as streptavidin, to the support and the
non-covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and
grammatical equivalents herein is meant that the two moieties, the solid support and the
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination
bonds. Covalent bonds can be formed directly between the probe and the solid support or can
be formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to a biochip in a wide variety of ways, as will be
appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or
other grammatical equivalents herein is meant a material that can be modified for the
attachment or association of the nucleic acid probes and is amenable to at least one detection
method. Often the substrate may contain discrete individual sites appropriate for individual
partitioning and identification. As will be appreciated by those in the art, the number of
possible substrates is very large, and includes, but is not limited to, glass and modified or
functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and
other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.),
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including
silicon and modified silicon, carbon, metals, inorganic glasses, plastics, etc. In general, the
substrates allow optical detection and do not appreciably fluoresce. A preferred substrate is
described in US application entitled Reusable Low Fluorescent Plastic Biochip, U.S.
Application Serial No. 09/270,214, filed March 15, 1999, herein incorporated by reference in
its entirety.

Generally the substrate is planar, although as will be appreciated by those in the art,
other configurations of substrates may be used as well. For example, the probes may be
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed
cell foams made of particular plastics.

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized, and then attached to the surface
of the solid support. Either the 5' or 3' terminus may be attached to the solid support, or
attachment may be via linkage to an internal nucleoside.

In another embodiment, the immobilization to the solid support may be very strong,
yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to
surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in
the art. For example, photoactivation techniques utilizing photopolymerization compounds
and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in
situ, using known photolithographic techniques, such as those described in WO 95/25116;
WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited within, all of
which are expressly incorporated by reference; these methods of attachment form the basis of
the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression level of
lung cancer-associated sequences. These assays are typically performed in conjunction with
reverse transcription. In such assays, a lung cancer-associated nucleic acid sequence acts as a
template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In a
quantitative amplification, the amount of amplification product will be proportional to the
amount of template in the original sample. Comparison to appropriate controls provides a
measure of the amount of lung cancer-associated RNA. Methods of quantitative
amplification are well known to those of skill in the art. Detailed protocols for quantitative
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and
Applications.
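
The proportionality between amplification product and starting template can be illustrated
with a simple idealized model. The sketch below assumes exponential amplification with a
constant per-cycle efficiency, which is an assumption introduced here for illustration and
not a protocol from the cited reference; the function names are hypothetical.

```python
# Illustrative sketch only: an idealized exponential-phase model with an
# assumed constant per-cycle efficiency (1.0 means perfect doubling).
def amplified_copies(template_copies: float, cycles: int,
                     efficiency: float = 1.0) -> float:
    """Product after a number of cycles: N = N0 * (1 + E) ** cycles."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return template_copies * (1.0 + efficiency) ** cycles


def relative_amount(sample_product: float, control_product: float) -> float:
    """Ratio of sample to control product measured under identical conditions."""
    return sample_product / control_product
```

Under this model, when the sample and the control are amplified for the same number of
cycles at the same efficiency, the ratio of products equals the ratio of starting templates,
which is why comparison to an appropriate control gives a measure of the original RNA amount.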

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase chain
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560, Landegren, et al. (1988)
Science 241:1077, and Barringer, et al. (1990) Gene 89:117), transcription amplification
(Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173), self-sustained sequence
replication (Guatelli, et al. (1990) Proc. Nat. Acad. Sci. USA 87:1874), dot PCR, and linker
adapter PCR, etc.

Expression of lung cancer proteins from nucleic acids 

In a preferred embodiment, lung cancer nucleic acids, e.g., encoding lung cancer
proteins, are used to make a variety of expression vectors to express lung cancer proteins
which can then be used in screening assays, as described below. Expression vectors and
recombinant DNA technology are well known to those of skill in the art (see, e.g., Ausubel,
supra, and Fernandez and Hoeffler (eds. 1999) Gene Expression Systems) and are used to
express proteins. The expression vectors may be either self-replicating extrachromosomal
vectors or vectors which integrate into a host genome. Generally, these expression vectors
include transcriptional and translational regulatory nucleic acid operably linked to the nucleic
acid encoding the lung cancer protein. The term "control sequences" refers to DNA
sequences used for the expression of an operably linked coding sequence in a particular host
organism. Control sequences that are suitable for prokaryotes, e.g., include a promoter,
optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to
utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional relationship with
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the sequence; or a ribosome binding site is operably
linked to a coding sequence if it is positioned so as to facilitate translation. Generally,
"operably linked" means that the DNA sequences being linked are contiguous, and, in the
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have
to be contiguous. Linking is typically accomplished by ligation at convenient restriction
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in
accordance with conventional practice. Transcriptional and translational regulatory nucleic
acid will generally be appropriate to the host cell used to express the lung cancer protein.
Numerous types of appropriate expression vectors, and suitable regulatory sequences are
known in the art for a variety of host cells.

In general, transcriptional and translational regulatory sequences may include, but are
not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop
sequences, translational start and stop sequences, and enhancer or activator sequences. In a
preferred embodiment, the regulatory sequences include a promoter and transcriptional start
and stop sequences.

Promoter sequences may be either constitutive or inducible promoters. The promoters
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which
combine elements of more than one promoter, are also known in the art, and are useful in the
present invention.

In addition, an expression vector may comprise additional elements. For example, the
expression vector may have two replication systems, thus allowing it to be maintained in two
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for
cloning and amplification. Furthermore, for integrating expression vectors, the expression
vector often contains at least one sequence homologous to the host cell genome, and
preferably two homologous sequences which flank the expression construct. The integrating
vector may be directed to a specific locus in the host cell by selecting the appropriate
homologous sequence for inclusion in the vector. Constructs for integrating vectors are well
known in the art (e.g., Fernandez and Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a selectable
marker gene to allow the selection of transformed host cells. Selection genes are well known
in the art and will vary with the host cell used.

The lung cancer proteins of the present invention are usually produced by culturing a
host cell transformed with an expression vector containing nucleic acid encoding a lung
cancer protein, under the appropriate conditions to induce or cause expression of the lung
cancer protein. Conditions appropriate for lung cancer protein expression will vary with the
choice of the expression vector and the host cell, and will be easily ascertained by one skilled
in the art through routine experimentation or optimization. For example, the use of
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae
and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK,
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a
macrophage cell line) and various other human cells and cell lines.

In a preferred embodiment, the lung cancer proteins are expressed in mammalian
cells. Mammalian expression systems are also known in the art, and include retroviral and
adenoviral systems. Of particular use as mammalian promoters are the promoters from
mammalian viral genes, since the viral genes are often highly expressed and have a broad
host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR
promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV
promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription termination and
polyadenylation sequences recognized by mammalian cells are regulatory regions located 3'
to the translation stop codon and thus, together with the promoter elements, flank the coding
sequence. Examples of transcription terminator and polyadenylation signals include those
derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as
other hosts, are well known in the art, and will vary with the host cell used. Techniques
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

In a preferred embodiment, lung cancer proteins are expressed in bacterial systems.
Promoters from bacteriophage may also be used and are known in the art. In addition,
synthetic promoters and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of
the trp and lac promoter sequences. Furthermore, a bacterial promoter can include naturally
occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA
polymerase and initiate transcription. In addition to a functioning promoter sequence, an
efficient ribosome binding site is desirable. The expression vector may also include a signal
peptide sequence that provides for secretion of the lung cancer protein in bacteria. The
protein is either secreted into the growth media (gram-positive bacteria) or into the
periplasmic space, located between the inner and outer membrane of the cell (gram-negative
bacteria). The bacterial expression vector may also include a selectable marker gene to allow
for the selection of bacterial strains that have been transformed. Suitable selection genes
include genes which render the bacteria resistant to drugs such as ampicillin,
chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers
also include biosynthetic genes, such as those in the histidine, tryptophan and leucine
biosynthetic pathways. These components are assembled into expression vectors. Expression
vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E.
coli, Streptococcus cremoris, and Streptococcus lividans, among others (e.g., Fernandez and
Hoeffler, supra). The bacterial expression vectors are transformed into bacterial host cells
using techniques well known in the art, such as calcium chloride treatment, electroporation,
and others.

In one embodiment, lung cancer proteins are produced in insect cells. Expression
vectors for the transformation of insect cells, and in particular, baculovirus-based expression
vectors, are well known in the art.

In a preferred embodiment, lung cancer protein is produced in yeast cells. Yeast
expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The lung cancer protein may also be made as a fusion protein, using techniques well
known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired epitope
is small, the lung cancer protein may be fused to a carrier protein to form an immunogen.
Alternatively, the lung cancer protein may be made as a fusion protein to increase expression
for affinity purification purposes, or for other reasons. For example, when the lung cancer
protein is a lung cancer peptide, the nucleic acid encoding the peptide may be linked to other
nucleic acid for expression purposes.

In a preferred embodiment, the lung cancer protein is purified or isolated after
expression. Lung cancer proteins may be isolated or purified in a variety of appropriate
ways. Standard purification methods include electrophoretic, molecular, immunological and
chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the lung cancer protein
may be purified using a standard anti-lung cancer protein antibody column. Ultrafiltration
and diafiltration techniques, in conjunction with protein concentration, are also useful. For
general guidance in suitable purification techniques, see Scopes (1982) Protein Purification.
The degree of purification necessary will vary depending on the use of the lung cancer
protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the lung cancer proteins and nucleic acids
are useful in a number of applications. They may be used as immunoselection reagents, as
vaccine reagents, as screening agents, therapeutic entities, for production of antibodies, as
transcription or translation inhibitors, etc.

Variants of lung cancer proteins

In one embodiment, the lung cancer proteins are derivative or variant lung cancer
proteins as compared to the wild-type sequence. That is, as outlined more fully below, the
derivative lung cancer peptide will often contain at least one amino acid substitution, deletion
or insertion, with amino acid substitutions being particularly preferred. The amino acid
substitution, insertion or deletion may occur at a particular residue within the lung cancer
peptide.

Also included within one embodiment of lung cancer proteins of the present invention
are amino acid sequence variants. These variants typically fall into one or more of three
classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the lung cancer
protein, using cassette or PCR mutagenesis or other techniques, to produce DNA encoding
the variant, and thereafter expressing the DNA in recombinant cell culture as outlined above.
However, variant lung cancer protein fragments having up to about 100-150 residues may be
prepared by in vitro synthesis. Amino acid sequence variants are characterized by the
predetermined nature of the variation, a feature that sets them apart from naturally occurring
allelic or interspecies variation of the lung cancer protein amino acid sequence. The variants
typically exhibit a similar qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below.

While the site or region for introducing an amino acid sequence variation is often
predetermined, the mutation per se need not be predetermined. For example, in order to
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed lung cancer variants screened for
the optimal combination of desired activity. Techniques exist for making substitution
mutations at predetermined sites in DNA having a known sequence, e.g., M13 primer
mutagenesis and PCR mutagenesis. Screening of mutants is often done using assays of lung
cancer protein activities.

Amino acid substitutions are typically of single residues; insertions usually will be on
the order of from about 1 to 20 amino acids, although considerably larger insertions may be
occasionally tolerated. Deletions generally range from about 1 to about 20 residues, although
in some cases deletions may be much larger.
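
The three classes of sequence variants described above can be sketched on a one-letter amino
acid string. The Python example below is illustrative only; the function names, the 1-based
residue numbering, and the example peptide are assumptions introduced here and do not
correspond to any sequence of the invention.

```python
# Illustrative sketch only: function names and numbering are assumptions.
def substitute(seq: str, position: int, new_residue: str) -> str:
    """Replace the single residue at a 1-based position."""
    i = position - 1
    if not 0 <= i < len(seq):
        raise IndexError("position outside the sequence")
    return seq[:i] + new_residue + seq[i + 1:]


def insert(seq: str, position: int, residues: str) -> str:
    """Insert residues after a 1-based position (0 inserts at the N-terminus)."""
    if not 0 <= position <= len(seq):
        raise IndexError("position outside the sequence")
    return seq[:position] + residues + seq[position:]


def delete(seq: str, position: int, length: int = 1) -> str:
    """Delete `length` residues starting at a 1-based position."""
    i = position - 1
    if i < 0 or length < 1 or i + length > len(seq):
        raise IndexError("deletion runs outside the sequence")
    return seq[:i] + seq[i + length:]
```

Applied to a hypothetical peptide MKTLLV, a substitution at residue 3 gives MKSLLV, a two
residue insertion after residue 2 gives MKGGTLLV, and a two residue deletion starting at
residue 4 gives MKTV.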

Substitutions, deletions, insertions or any combination thereof may be used to arrive
at a final derivative. Generally these changes are done on a few amino acids to minimize the
alteration of the molecule. Larger changes may be tolerated in certain circumstances. When
small alterations in the characteristics of a lung cancer protein are desired, substitutions are
generally made in accordance with the amino acid substitution chart provided in the
definition section.

Variants typically exhibit essentially the same qualitative biological activity and will
elicit the same immune response as a naturally-occurring analog, although variants also are
selected to modify the characteristics of lung cancer proteins as needed. Alternatively, the
variant may be designed or reorganized such that a biological activity of the lung cancer
protein is altered. For example, glycosylation sites may be added, altered, or removed.

Covalent modifications of lung cancer polypeptides are included within the scope of
this invention. One type of covalent modification includes reacting targeted amino acid
residues of a lung cancer polypeptide with an organic derivatizing agent that is capable of
reacting with selected side chains or the N- or C-terminal residues of a lung cancer
polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking
lung cancer polypeptides to a water-insoluble support matrix or surface for use in a method
for purifying anti-lung cancer polypeptide antibodies or screening assays, as is more fully
described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g., esters
with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters such as
3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl residues to
the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues,
methylation of the γ-amino groups of lysine, arginine, and histidine side chains (Creighton
(1983) Proteins: Structure and Molecular Properties, pp. 79-86), acetylation of the N-terminal
amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the lung cancer polypeptide encompassed by
this invention is an altered native glycosylation pattern of the polypeptide. "Altering the
native glycosylation pattern" is intended herein to mean adding to or deleting one or more
carbohydrate moieties of a native sequence lung cancer polypeptide. Glycosylation patterns
can be altered in many ways. For example, the use of different cell types to express lung
cancer-associated sequences can result in different glycosylation patterns.

Addition of glycosylation sites to lung cancer polypeptides may also be accomplished
by altering the amino acid sequence thereof. The alteration may be made, e.g., by the
addition of, or substitution by, one or more serine or threonine residues to the native sequence
lung cancer polypeptide (for O-linked glycosylation sites). The lung cancer amino acid
sequence may optionally be altered through changes at the DNA level, particularly by
mutating the DNA encoding the lung cancer polypeptide at preselected bases such that
codons are generated that will translate into the desired amino acids.
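
Altering the amino acid sequence through changes at the DNA level can be sketched as a codon
replacement followed by translation. The Python example below uses the standard genetic
code; the function names, the 1-based codon numbering, and the short coding sequence are
hypothetical and serve only to show a codon being changed so that it translates into serine.

```python
# Illustrative sketch only: names and the example sequence are assumptions.
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mutate_codon(dna: str, codon_number: int, new_codon: str) -> str:
    """Replace the codon at a 1-based codon number with new_codon."""
    if len(new_codon) != 3 or new_codon.upper() not in CODON_TABLE:
        raise ValueError("new_codon must be a valid three-base codon")
    start = (codon_number - 1) * 3
    if start < 0 or start + 3 > len(dna):
        raise IndexError("codon number outside the sequence")
    return dna[:start] + new_codon.upper() + dna[start + 3:]
```

For instance, the hypothetical sequence ATGGCTAAA translates to MAK; replacing codon 2 with
TCT yields ATGTCTAAA, which translates to MSK and so introduces a serine residue.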

Another means of increasing the number of carbohydrate moieties on the lung cancer
polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. Such
methods are described in the art, e.g., in WO 87/05330, and in Aplin and Wriston (1981)
CRC Crit. Rev. Biochem., pp. 259-306.

Removal of carbohydrate moieties present on the lung cancer polypeptide may be
accomplished chemically or enzymatically or by mutational substitution of codons encoding
for amino acid residues that serve as targets for glycosylation. Chemical deglycosylation
techniques are known in the art and described, for instance, by Hakimuddin, et al. (1987)
Arch. Biochem. Biophys. 259:52 and by Edge, et al. (1981) Anal. Biochem. 118:131.
Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth.
Enzymol. 138:350.

Another type of covalent modification of lung cancer polypeptides comprises linking the lung
cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g., polyethylene
glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Patent
Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192, or 4,179,337.

Lung cancer polypeptides of the present invention may also be modified in a way to
form chimeric molecules comprising a lung cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a lung cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the lung cancer polypeptide. The
presence of such epitope-tagged forms of a lung cancer polypeptide can be detected using an
antibody against the tag polypeptide. Also, provision of the epitope tag enables the lung
cancer polypeptide to be readily purified by affinity purification using an anti-tag antibody or
another type of affinity matrix that binds to the epitope tag. In an alternative embodiment,
the chimeric molecule may comprise a fusion of a lung cancer polypeptide with an
immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the
chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known and examples
include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6 and metal
chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field, et al. (1988) Mol.
Cell Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies
thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616); and the Herpes
Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al. (1990) Protein
Engineering 3(6):547-553). Other tag polypeptides include the Flag-peptide (Hopp, et al.
(1988) BioTechnology 6:1204-1210); the KT3 epitope peptide (Martin, et al. (1992) Science
255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem.
266:15163-15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990) Proc. Nat'l
Acad. Sci. USA 87:6393-6397).

Also included are other lung cancer proteins of the lung cancer family, and lung
cancer proteins from other organisms, which are cloned and expressed as outlined below.
Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be used to
find other related lung cancer proteins from primates or other organisms. As will be
appreciated by those in the art, particularly useful probe and/or PCR primer sequences
include unique areas of the lung cancer nucleic acid sequence. As is generally known in the
art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with from
about 20 to about 30 being preferred, and may contain inosine as needed. PCR reaction
conditions are well known in the art (e.g., Innis, PCR Protocols, supra).
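
The primer length ranges stated above can be illustrated with a short sketch. The function
name, the returned labels, and the treatment of "about" as a hard cutoff are assumptions made
for illustration only and are not part of the disclosed methods; inosine is written here as
the letter I.

```python
# Illustrative sketch only: names, labels, and hard cutoffs are assumptions.
ALLOWED_BASES = set("ACGTI")  # "I" stands for inosine, which may be used as needed


def classify_primer_length(primer: str) -> str:
    """Classify a primer by the length ranges given in the text."""
    sequence = primer.upper()
    if not sequence or not set(sequence) <= ALLOWED_BASES:
        raise ValueError("primer must contain only A, C, G, T, or I")
    length = len(sequence)
    if 20 <= length <= 30:
        return "more preferred"  # about 20 to about 30 nucleotides
    if 15 <= length <= 35:
        return "preferred"  # about 15 to about 35 nucleotides
    return "outside stated range"
```

A candidate primer of 25 nucleotides would fall in the narrower range, whereas a primer of
12 nucleotides would fall outside both ranges.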

Antibodies to lung cancer proteins 

In a preferred embodiment, when a lung cancer protein is to be used to generate
antibodies, e.g., for immunotherapy or immunodiagnosis, the lung cancer protein should
share at least one epitope or determinant with the full-length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller lung cancer protein will be able to bind to the full-length protein,
particularly to linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are well known (e.g., Coligan, supra; and
Harlow and Lane, supra). Polyclonal antibodies can be raised in a mammal, e.g., by one or
more injections of an immunizing agent and, if desired, an adjuvant. Typically, the
immunizing agent and/or adjuvant will be injected in the mammal by multiple subcutaneous
or intraperitoneal injections. The immunizing agent may include a protein encoded by a
nucleic acid of Tables 1A-16 or fragment thereof or a fusion protein thereof. It may be useful
to conjugate the immunizing agent to a protein known to be immunogenic in the mammal
being immunized. Immunogenic proteins include, e.g., keyhole limpet hemocyanin, serum
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. Adjuvants include, e.g.,
Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic
trehalose dicorynomycolate). The immunization protocol may be selected by one skilled in
the art.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies
may be prepared using hybridoma methods, such as those described by Kohler and Milstein
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host
animal is typically immunized with an immunizing agent to elicit lymphocytes that produce
or are capable of producing antibodies that will specifically bind to the immunizing agent.
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will
typically include a polypeptide encoded by a nucleic acid of the tables, or fragment thereof,
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell (Goding (1986) Monoclonal Antibodies: Principles and Practice, pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells
of rodent, bovine, or primate origin. Usually, rat or mouse myeloma cell lines are employed.
The hybridoma cells may be cultured in a suitable culture medium that preferably contains
one or more substances that inhibit the growth or survival of the unfused, immortalized cells.
For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl
transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include
hypoxanthine, aminopterin, and thymidine ("HAT medium"), which substances prevent the
growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are
typically monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of the tables or a fragment thereof, and the other one is for
any other antigen, preferably for a cell-surface protein or receptor or receptor subunit,
and preferably one that is tumor specific. Alternatively, tetramer-type technology may create
multivalent reagents.

In a preferred embodiment, the antibodies to lung cancer protein are capable of
reducing or eliminating a biological function of a lung cancer protein, in a naked form or
conjugated to an effector moiety. That is, the addition of anti-lung cancer protein antibodies
(either polyclonal or preferably monoclonal) to lung cancer tissue (or cells containing lung
cancer) may reduce or eliminate the lung cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.

In a preferred embodiment, the antibodies to the lung cancer proteins are humanized
antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein Design Labs,
Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) which contain minimal
sequence derived from non-human immunoglobulin. Humanized antibodies include human
immunoglobulins (recipient antibody) in which residues from a complementarity determining
region (CDR) of the recipient are replaced by residues from a CDR of a non-human species
(donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and
capacity. In some instances, Fv framework residues of a human immunoglobulin are
replaced by corresponding non-human residues. Humanized antibodies may also comprise
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, a humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of
the framework (FR) regions are those of a human immunoglobulin consensus sequence. A
humanized antibody optimally also will comprise at least a portion of an
immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones, et
al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-329; and Presta
(1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be performed following the
method of Winter and co-workers (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al.
(1988) Nature 332:323-327; Verhoeyen, et al. (1988) Science 239:1534-1536), by
substituting rodent CDRs or CDR sequences for corresponding sequences of a human
antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Patent No.
4,816,567), wherein substantially less than an intact human variable domain has been
substituted by corresponding sequence from a non-human species.

Human-like antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381;
Marks, et al. (1991) J. Mol. Biol. 222:581). The techniques of Cole, et al. and Boerner, et al.
are also available for the preparation of human monoclonal antibodies (Cole, et al. (1985)
Monoclonal Antibodies and Cancer Therapy, p. 77 and Boerner, et al. (1991) J. Immunol.
147(1):86-95). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in
nearly all respects, including gene rearrangement, assembly, and antibody repertoire. This
approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in the following scientific publications: Marks, et al. (1992)
Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison (1994)
Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51; Neuberger
(1996) Nature Biotechnology 14:826; and Lonberg and Huszar (1995) Intern. Rev. Immunol.
13:65-93.

By immunotherapy is meant treatment of lung cancer with an antibody raised against
a lung cancer protein. As used herein, immunotherapy can be passive or active. Passive
immunotherapy as defined herein is the passive transfer of antibody to a recipient (patient).
Active immunization is the induction of antibody and/or T-cell responses in a recipient
(patient). Induction of an immune response is the result of providing the recipient with an
antigen to which antibodies are raised. The antigen may be provided by injecting a
polypeptide against which antibodies are desired to be raised into a recipient, or contacting
the recipient with a nucleic acid capable of expressing the antigen and under conditions for
expression of the antigen, leading to an immune response.

In a preferred embodiment, the lung cancer proteins against which antibodies are
raised are secreted proteins as described above. Without being bound by theory, antibodies
used for treatment may bind and prevent the secreted protein from binding to its receptor,
thereby inactivating the secreted lung cancer protein.

In another preferred embodiment, the lung cancer protein to which antibodies are
raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment may bind the extracellular domain of the lung cancer protein and prevent it from
binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane lung cancer protein. The
antibody may be a competitive, non-competitive or uncompetitive inhibitor of protein binding
to the extracellular domain of the lung cancer protein. The antibody may be an antagonist of
the lung cancer protein or may prevent activation of a transmembrane lung cancer protein, or
may induce or suppress a particular cellular pathway. In some embodiments, when the
antibody prevents the binding of other molecules to the lung cancer protein, the antibody
prevents growth of the cell. The antibody may also be used to target or sensitize the cell to
cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, IFN-γ, and IL-2, or
chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin, methotrexate,
and the like. In some instances the antibody may belong to a sub-type that activates serum
complement when complexed with the transmembrane protein, thereby mediating cytotoxicity
or antibody-dependent cell-mediated cytotoxicity (ADCC). Thus, lung cancer may be treated by
administering to a patient antibodies directed against the transmembrane lung cancer protein.
Antibody-labeling may activate a co-toxin, localize a toxin payload, or otherwise provide
means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector moiety.
The effector moiety can be various molecules, including labeling moieties such as radioactive
labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the therapeutic
moiety is a small molecule that modulates the activity of a lung cancer protein. In another
aspect the therapeutic moiety may modulate an activity of molecules associated with or in
close proximity to a lung cancer protein. The therapeutic moiety may inhibit enzymatic or
signaling activity such as protease or collagenase activity associated with lung cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In
this method, targeting the cytotoxic agent to lung cancer tissue or cells results in a reduction
in the number of afflicted cells, thereby reducing symptoms associated with lung cancer.
Cytotoxic agents are numerous and varied and include, but are not limited to, cytotoxic drugs
or toxins or active fragments of such toxins. Suitable toxins and their corresponding
fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A chain, curcin,
crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against lung cancer
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane lung cancer proteins not
only serves to increase the local concentration of therapeutic moiety in the lung cancer
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the untargeted therapeutic moiety.

In another preferred embodiment, the lung cancer protein against which the antibodies
are raised is an intracellular protein. In this case, the antibody may be conjugated to a protein
or other entity which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, where the lung cancer protein can be targeted within a
cell, e.g., to the nucleus, an antibody thereto may contain a signal for that target localization,
e.g., a nuclear localization signal.

The lung cancer antibodies of the invention specifically bind to lung cancer proteins.
By "specifically bind" herein is meant that the antibodies bind to the protein with a Kd of at
least about 0.1 mM, more usually at least about 1 µM, preferably at least about 0.1 µM or
better, and most preferably, 0.01 µM or better. Selectivity of binding to the specific target
and not to related other sequences is also important.

Detection of lung cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the lung cancer phenotype. Expression levels of genes in normal tissue (e.g.,
not undergoing lung cancer), in lung cancer tissue (and in some cases, for varying severities
of lung cancer that relate to prognosis, as outlined below), or in non-malignant disease are
evaluated to provide expression profiles. A gene expression profile of a particular cell state
or point of development is essentially a "fingerprint" of the state of the cell. While two states
may have a particular gene similarly expressed, the evaluation of a number of genes
simultaneously allows the generation of a gene expression profile that is reflective of the state
of the cell. By comparing expression profiles of cells in different states, information
regarding which genes are important (including both up- and down-regulation of genes) in
each of these states is obtained. Then, diagnosis may be performed or confirmed to
determine whether a tissue sample has the gene expression profile of normal or cancerous
tissue. This will provide for molecular diagnosis of related conditions.

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus lung cancer tissue. Genes may be turned on or turned off in a particular state,
relative to another state, thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in
an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby
expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, the change in expression (i.e., upregulation or downregulation) is preferably
at least about 50%, more preferably at least about 100%, more preferably at least about
150%, more preferably at least about 200%, with from 300 to at least 1000% being especially
preferred.
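
The percentage thresholds stated above can be illustrated with a short sketch. The text does
not specify how the percentage change is calculated, so the formula below (magnitude of the
difference relative to a reference level) is an assumption, as are the function and
variable names.

```python
# Illustrative sketch only: the formula and all names are assumptions.
THRESHOLDS = (300.0, 200.0, 150.0, 100.0, 50.0)  # percent, highest first


def percent_change(reference: float, sample: float) -> float:
    """Magnitude of the change in expression, as a percentage of the reference."""
    if reference <= 0:
        raise ValueError("reference expression level must be positive")
    return abs(sample - reference) / reference * 100.0


def highest_threshold_met(change_pct: float):
    """Return the largest stated threshold that the change meets, or None."""
    for threshold in THRESHOLDS:
        if change_pct >= threshold:
            return threshold
    return None
```

Under this assumed formula, a decrease can never exceed 100% of the reference level, so the
higher thresholds would apply only to upregulation; a fold-change measure would treat
increases and decreases symmetrically.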

Evaluation may be at the gene transcript or the protein level. The amount of gene
expression may be monitored using nucleic acid probes to the RNA or DNA equivalent of the
gene transcript, and the quantification of gene expression levels, or, alternatively, the final
gene product itself (protein) can be monitored, e.g., with antibodies to the lung cancer protein
and standard immunoassays (ELISAs, etc.) or other techniques, including mass spectroscopy
assays, 2D gel electrophoresis assays, etc. Proteins corresponding to lung cancer genes, e.g.,
those identified as being important in a lung cancer or disease phenotype, can be evaluated in
a lung cancer diagnostic test. In a preferred embodiment, gene expression monitoring is
performed simultaneously on a number of genes.

The lung cancer nucleic acid probes may be attached to biochips as outlined herein for
the detection and quantification of lung cancer sequences in a particular cell. The assays are
further described below in the example. PCR techniques can be used to provide greater
sensitivity. Multiple protein expression monitoring can be performed as well. Similarly,
these assays may be performed on an individual basis as well.

In a preferred embodiment, nucleic acids encoding the lung cancer protein are
detected. Although DNA or RNA encoding the lung cancer protein may be detected, of
particular interest are methods wherein an mRNA encoding a lung cancer protein is detected.
Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is complementary to
and hybridizes with the mRNA and includes, but is not limited to, oligonucleotides, cDNA or
RNA. Probes also should contain a detectable label, as defined herein. In one method the
mRNA is detected after immobilizing the nucleic acid to be examined on a solid support such
as nylon membranes and hybridizing the probe with the sample. Following washing to
remove the non-specifically bound probe, the label is detected. In another method detection
of the mRNA is performed in situ. In this method permeabilized cells or tissue samples are
contacted with a detectably labeled nucleic acid probe for sufficient time to allow the probe
to hybridize with the target mRNA. Following washing to remove the non-specifically bound
probe, the label is detected. For example, a digoxygenin labeled riboprobe (RNA probe) that
is complementary to the mRNA encoding a lung cancer protein is detected by binding the
digoxygenin with an anti-digoxygenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The lung cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing lung cancer sequences are used in diagnostic assays. This can be performed on an
individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, lung cancer proteins, including intracellular,
transmembrane, or secreted proteins, find use as markers of lung cancer, e.g., for prognostic
or diagnostic purposes. Detection of these proteins in putative lung cancer tissue allows for
detection, prognosis, or diagnosis of lung cancer or similar disease, and perhaps for selection
of therapeutic strategy. In one embodiment, antibodies are used to detect lung cancer
proteins. A preferred method separates proteins from a sample by electrophoresis on a gel
(typically a denaturing and reducing protein gel, but may be another type of gel, including
isoelectric focusing gels and the like). Following separation of proteins, the lung cancer
protein is detected, e.g., by immunoblotting with antibodies raised against the lung cancer
protein. Methods of immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the lung cancer protein find use in in situ
imaging techniques, e.g., in histology (e.g., Asai (ed. 1993) Methods in Cell Biology:
Antibodies in Cell Biology, volume 37). In this method cells are contacted with from one to
many antibodies to the lung cancer protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label, e.g., multicolor fluorescence or confocal imaging. In another method the primary
antibody to the lung cancer protein(s) contains a detectable label, e.g., an enzyme marker that
can act on a substrate. In another preferred embodiment each one of multiple primary
antibodies contains a distinct and detectable label. This method finds particular use in
simultaneous screening for a plurality of lung cancer proteins. Many other histological
imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence
activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing lung cancer from
blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as
samples to be probed or tested for the presence of lung cancer proteins. Antibodies can be
used to detect a lung cancer protein by previously described immunoassay techniques
including ELISA, immunoblotting (western blotting), immunoprecipitation, BIACORE
technology and the like. Conversely, the presence of antibodies may indicate an immune
response against an endogenous lung cancer protein or vaccine.

In a preferred embodiment, in situ hybridization of labeled lung cancer nucleic acid
probes to tissue arrays is done. For example, arrays of tissue samples, including lung cancer
tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra) is then
performed. When comparing the fingerprints between an individual and a standard, the
skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is
further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis, and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in prognosis assays.
As above, gene expression profiles can be generated that correlate to lung cancer, clinical,
pathological, or other information, in terms of long term prognosis. Again, this may be done
on either a protein or gene level, with the use of genes being preferred. Single or multiple
genes may be useful in various combinations. As above, lung cancer probes may be attached
to biochips for the detection and quantification of lung cancer sequences in a tissue or patient.
The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 

In a preferred embodiment, the proteins, nucleic acids, and antibodies as described
herein are used in drug screening assays. The lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in drug screening
assays or by evaluating the effect of drug candidates on a "gene expression profile" or
expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-8; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing the native or modified lung cancer proteins are used in
screening assays. That is, the present invention provides novel methods for screening for
compositions which modulate the lung cancer phenotype or an identified physiological
function of a lung cancer protein. As above, this can be done on an individual gene level or
by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a
candidate agent, see Zlokarnik, supra.

Having identified differentially expressed genes herein, a variety of assays may be
performed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene with altered regulation in lung cancer, test
compounds can be screened for the ability to modulate gene expression or for binding to the
lung cancer protein. "Modulation" thus includes an increase or a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing lung cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in lung cancer tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in lung
cancer tissue compared to normal tissue often provides a target value of a 10-fold increase in
expression to be induced by the test compound.
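
The relationship between the observed change and the desired modulation can be sketched as
follows. The function name, the returned labels, and the use of a simple ratio of expression
levels are assumptions made for illustration only.

```python
# Illustrative sketch only: names and the simple-ratio model are assumptions.
from typing import Tuple


def target_modulation(normal_level: float, cancer_level: float) -> Tuple[str, float]:
    """Return the direction and fold of modulation that would restore the normal level."""
    if normal_level <= 0 or cancer_level <= 0:
        raise ValueError("expression levels must be positive")
    if cancer_level > normal_level:
        # Gene is upregulated in lung cancer tissue: a matching decrease is desired.
        return ("decrease", cancer_level / normal_level)
    if cancer_level < normal_level:
        # Gene is downregulated in lung cancer tissue: a matching increase is desired.
        return ("increase", normal_level / cancer_level)
    return ("none", 1.0)
```

For a gene with a 4-fold increase in lung cancer tissue, this sketch returns a target of
about a 4-fold decrease, consistent with the example given in the text.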

The amount of gene expression may be monitored using nucleic acid probes and the
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the lung cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, gene or protein expression of a number of
entities, i.e., an expression profile, is monitored simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the lung cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of lung cancer sequences in a particular
cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may be used
with dispensed primers in desired wells. A PCR reaction can then be performed and analyzed
for each well.

Expression monitoring can be performed to identify compounds that modify the
expression of one or more lung cancer-associated sequences, e.g., a polynucleotide sequence
set out in the tables. Generally, in a preferred embodiment, a test compound is added to the
cells prior to analysis. Moreover, screens are also provided to identify agents that modulate
lung cancer, modulate lung cancer proteins, bind to a lung cancer protein, or interfere with
the binding of a lung cancer protein and an antibody, substrate, or other binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the lung cancer phenotype or the expression of a lung cancer sequence, e.g., a
nucleic acid or protein sequence. In preferred embodiments, modulators alter expression
profiles of nucleic acids or proteins provided herein. In one embodiment, the modulator
suppresses a lung cancer phenotype, e.g., to a normal or non-malignant tissue fingerprint. In
another embodiment, a modulator induces a lung cancer phenotype. Generally, a plurality of
assay mixtures are run in parallel with different agent concentrations to obtain a differential
response to the various concentrations. Typically, one of these concentrations serves as a
negative control, i.e., at zero concentration or below the level of detection.

In one aspect, a modulator will neutralize the effect of a lung cancer protein. By
"neutralize" is meant that activity of a protein and the consequent effect on the cell is
inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a lung cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve providing a
library containing a large number of potential therapeutic compounds (candidate
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical compounds
generated by either chemical synthesis or biological synthesis by combining a number of
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical
building blocks called amino acids in every possible way for a given compound length (i.e.,
the number of amino acids in a polypeptide compound). Millions of chemical compounds
can be synthesized through such combinatorial mixing of chemical building blocks (Gallop,
et al. (1994) J. Med. Chem. 37(9):1233-1251).
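
The combinatorial growth described above can be illustrated with a short sketch. It is not
part of the disclosed methods; it simply assumes the 20 standard amino acids as the building
blocks, and the function names are hypothetical.

```python
from itertools import product

# The 20 standard amino acids in one-letter code (the "building blocks").
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def library_size(length, alphabet=AMINO_ACIDS):
    """Number of distinct linear sequences of the given length."""
    return len(alphabet) ** length

def enumerate_library(length, alphabet=AMINO_ACIDS):
    """Lazily yield every linear sequence of the given length."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)

dipeptides = list(enumerate_library(2))
print(len(dipeptides), dipeptides[:3])  # 400 ['AA', 'AC', 'AD']
print(library_size(5))                  # 3200000
```

Because the count grows as 20 to the power of the length, a pentapeptide library already
exceeds three million members, which is consistent with the "millions of chemical compounds"
noted above.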

Preparation and screening of combinatorial chemical libraries is well known to those
of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Pept. Prot. Res. 37:487-493,
Houghton, et al. (1991) Nature 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc.
114:6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann, et
al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic syntheses of small
compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc. 116:2661), oligocarbamates
(Cho, et al. (1993) Science 261:1303), and/or peptidyl phosphonates (Campbell, et al. (1994)
J. Org. Chem. 59:658). See, generally, Gordon, et al. (1994) J. Med. Chem. 37:1385, nucleic
acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid libraries (see, e.g., U.S.
Patent 5,539,083), antibody libraries (see, e.g., Vaughn, et al. (1996) Nature Biotechnology
14(3):309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang, et al. (1996)
Science 274:1520-1522, and U.S. Patent No. 5,593,853), and small organic molecule libraries
(see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33; isoprenoids, U.S. Patent
No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines,
U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent No.
5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially available (see,
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville, KY; Symphony, Rainin,
Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore, Bedford,
MA).

A number of well known robotic systems have also been developed for solution phase
chemistries. These systems include automated workstations like the automated synthesis
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed
by a chemist. The above devices, with appropriate modification, are suitable for use with the
present invention. In addition, numerous combinatorial libraries are themselves
commercially available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos,
Inc., St. Louis, MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek
Biosciences, Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening. 
Preferred assays thus detect modulation of lung cancer gene transcription, polypeptide 
expression, and polypeptide activity. 

High throughput assays for evaluating the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate procedures, including sample and reagent pipetting, liquid dispensing,
timed incubations, and final readings of the microplate in detector(s) appropriate for the
assay. These configurable systems provide high throughput and rapid start up as well as a
high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring proteins or
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or
random or directed digests of proteinaceous cellular extracts, may be used. In this way
libraries of proteins may be made for screening in the methods of the invention. Particularly
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins,
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to
about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins, random peptides, or "biased" random peptides. By "randomized" or grammatical
equivalents herein is meant that the nucleic acid or peptide consists of essentially random
sequences of nucleotides and amino acids, respectively. Since these random peptides (or
nucleic acids, discussed below) are often chemically synthesized, they may incorporate a
nucleotide or amino acid at any position. The synthetic process can be designed to generate
randomized proteins or nucleic acids, to allow the formation of all or most of the possible
combinations over the length of the sequence, thus forming a library of randomized candidate
bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence preferences or
constants at any position. In a preferred embodiment, the library is biased. That is, some
positions within the sequence are either held constant, or are selected from a limited number
of possibilities. In a preferred embodiment, the nucleotides or amino acid residues are
randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic residues,
sterically biased (either small or large) residues, towards the creation of nucleic acid binding
domains, the creation of cysteines, for cross-linking, prolines for SH-3 domains, serines,
threonines, tyrosines or histidines for phosphorylation sites, etc.
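
The distinction between a fully randomized and a biased library can be sketched in a few
lines. The template below is hypothetical, and the membership of the hydrophobic class is an
assumption chosen only for illustration: a position with one allowed residue is held
constant, a position with several is drawn from that limited set.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = "AVLIMFW"  # illustrative class; exact membership is an assumption

def biased_peptide(template, rng):
    """Build one peptide from a per-position template.

    Each template entry is a string of allowed residues: a single letter
    holds the position constant, several letters restrict the choice.
    """
    return "".join(rng.choice(allowed) for allowed in template)

# Hypothetical 7-mer: fixed cysteines at both ends (e.g., for cross-linking),
# two hydrophobic positions, and fully random residues elsewhere.
template = ["C", AMINO_ACIDS, HYDROPHOBIC, HYDROPHOBIC, AMINO_ACIDS, AMINO_ACIDS, "C"]

rng = random.Random(0)  # seeded so the sketch is reproducible
library = [biased_peptide(template, rng) for _ in range(5)]
print(library)
```

A fully randomized library corresponds to a template in which every position lists all
twenty residues.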

Modulators of lung cancer can also be nucleic acids, as defined above.

As described above generally for proteins, nucleic acid modulating agents may be
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
Digests of procaryotic or eucaryotic genomes may be used as is outlined above for proteins.

In a preferred embodiment, the candidate compounds are organic chemical moieties, a 
wide variety of which are available in the literature. 

After a candidate agent has been added and the cells allowed to incubate for some
period of time, the sample containing a target sequence is analyzed. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis.

Nucleic acid assays can be direct hybridization assays or can comprise "sandwich
assays", which include the use of multiple probes, as is generally outlined in U.S. Patent Nos.
5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670,
5,591,584, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of
which are hereby incorporated by reference. In this embodiment, in general, the target nucleic
acid is prepared as outlined above, and then added to the biochip comprising a plurality of
nucleic acid probes, under conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, including
high, moderate and low stringency conditions as outlined above. The assays are generally
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic
solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is generally
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways. Components
of the reaction may be added simultaneously, or sequentially, in different orders, with
preferred embodiments outlined below. In addition, the reaction may include a variety of
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.

Screens are performed to identify modulators of the lung cancer phenotype. In one
embodiment, screening is performed to identify modulators that can induce or suppress a
particular expression profile, thus preferably generating the associated phenotype. In another
embodiment, e.g., for diagnostic applications, having identified differentially expressed genes
important in a particular state, screens can be performed to identify modulators that alter
expression of individual genes. In another embodiment, screening is performed to identify
modulators that alter a biological function of the expression product of a differentially
expressed gene. Again, having identified the importance of a gene in a particular state,
screens are performed to identify agents that bind and/or modulate the biological activity of
the gene product, or evaluate genetic polymorphisms.

Genes can be screened for those that are induced in response to a candidate agent.
After identifying a modulator based upon its ability to suppress a lung cancer expression
pattern leading to a normal expression pattern, or to modulate a single lung cancer gene
expression profile so as to mimic the expression of the gene from normal tissue, a screen as
described above can be performed to identify genes that are specifically modulated in
response to the agent. Comparing expression profiles between normal tissue and agent
treated lung cancer tissue reveals genes that are not expressed in normal tissue or lung cancer
tissue, but are expressed in agent treated tissue. These agent-specific sequences can be
identified and used by methods described herein for lung cancer genes or proteins. In
particular these sequences and the proteins they encode find use in marking or identifying
agent treated cells. In addition, antibodies can be raised against the agent induced proteins
and used to target novel therapeutics to the treated lung cancer tissue sample.
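
The identification of agent-specific sequences described above amounts to a set difference
across three profiles. The following is a minimal sketch under stated assumptions: the gene
names, signal values, and the detection threshold of 1.0 are all hypothetical, and a gene is
treated as "expressed" when its signal exceeds that threshold.

```python
def expressed(profile, threshold=1.0):
    """Return the set of genes whose signal exceeds a detection threshold."""
    return {gene for gene, level in profile.items() if level > threshold}

def agent_specific(normal, tumor, treated, threshold=1.0):
    """Genes detected in agent-treated tissue but in neither normal nor tumor tissue."""
    return (expressed(treated, threshold)
            - expressed(normal, threshold)
            - expressed(tumor, threshold))

# Hypothetical expression signals (arbitrary units).
normal = {"GENE_A": 5.0, "GENE_B": 0.2, "GENE_C": 0.1}
tumor = {"GENE_A": 0.3, "GENE_B": 9.0, "GENE_C": 0.4}
treated = {"GENE_A": 4.1, "GENE_B": 0.5, "GENE_C": 7.5}

print(agent_specific(normal, tumor, treated))  # {'GENE_C'}
```

Here GENE_A is restored toward its normal level by the agent but is not agent-specific,
whereas GENE_C appears only after treatment.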

Thus, in one embodiment, a test compound is administered to a population of lung
cancer cells that have an associated lung cancer expression profile. By "administration" or
"contacting" herein is meant that the candidate agent is added to the cells in such a manner as
to allow the agent to act upon the cell, whether by uptake and intracellular action, or by
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous
candidate agent (i.e., a peptide) may be put into a viral construct such as an adenoviral or
retroviral construct, and added to the cell, such that expression of the peptide agent is
accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used.

Once a test compound has been administered to the cells, the cells can be washed if
desired and are allowed to incubate under preferably physiological conditions for some
period of time. The cells are then harvested and a new gene expression profile is generated,
as outlined herein.
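
Comparing the expression profile generated after treatment with the profile obtained before
treatment can be sketched as follows. This is an illustration only: the log2 fold-change
cutoff, the pseudocount, and the gene names and values are assumptions, not parameters taken
from the methods described herein.

```python
import math

def changed_genes(before, after, min_abs_log2=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes whose change meets the cutoff.

    A pseudocount is added to both values so that zero signals do not
    cause a division by zero or an undefined logarithm.
    """
    changes = {}
    for gene in before.keys() & after.keys():
        log2_fc = math.log2((after[gene] + pseudocount) / (before[gene] + pseudocount))
        if abs(log2_fc) >= min_abs_log2:
            changes[gene] = log2_fc
    return changes

# Hypothetical profiles before and after exposure to a test compound.
before = {"GENE_A": 7.0, "GENE_B": 15.0, "GENE_C": 3.0}
after = {"GENE_A": 1.0, "GENE_B": 15.0, "GENE_C": 31.0}

hits = changed_genes(before, after)
print(sorted(hits.items()))  # [('GENE_A', -2.0), ('GENE_C', 3.0)]
```

A non-empty result indicates that at least one gene of the profile changed in response to the
compound.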

Thus, e.g., lung cancer or non-malignant tissue may be screened for agents that
modulate, e.g., induce or suppress a lung cancer phenotype. A change in at least one gene,
preferably many, of the expression profile indicates that the agent has an effect on lung
cancer activity. By defining such a signature for the lung cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be
known and need not be represented in the original expression screening platform, nor does
the level of transcript for the target protein need to change.

Measurement of lung cancer polypeptide activity, or of lung cancer or the lung cancer
phenotype, can be performed using a variety of assays. For example, the effects of the test
compounds upon the function of the metastatic polypeptides can be measured by examining
parameters described above. A suitable physiological change that affects activity can be used
to assess the influence of a test compound on the polypeptides of this invention. When the
functional consequences are determined using intact cells or animals, one can also measure a
variety of effects such as, in the case of lung cancer associated with tumors, tumor growth,
tumor metastasis, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian lung cancer polypeptide is typically used, e.g.,
mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro.
For example, a lung cancer polypeptide is first contacted with a potential modulator and
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the
lung cancer polypeptide levels are determined in vitro by measuring the level of protein or
mRNA. The level of protein is typically measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the lung cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is typically detected using directly or
indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using a lung cancer protein
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity
is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of the expression of
the gene or the gene product itself can be done. The gene products of differentially expressed
genes are sometimes referred to herein as "lung cancer proteins." The lung cancer protein
may be a fragment, or alternatively, be the full length protein to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes is
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present. Alternatively,
cells comprising the lung cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a lung cancer
protein and a candidate compound, and determining the binding of the compound to the lung
cancer protein. Preferred embodiments utilize the human lung cancer protein, although other
mammalian proteins may also be used, e.g., for the development of animal models of human
disease. In some embodiments, as outlined herein, variant or derivative lung cancer proteins
may be used.

Generally, in a preferred embodiment of the methods herein, the lung cancer protein
or the candidate agent is non-diffusably bound to an insoluble support, preferably having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is
readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of a convenient
shape. Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is typically not crucial so
long as it is compatible with the reagents and overall methods of the invention, maintains the
activity of the composition, and is nondiffusable. Preferred methods of binding include the
use of antibodies (which do not sterically block either the ligand binding site or activation
sequence when the protein is bound to the support), direct binding to "sticky" or ionic
supports, chemical crosslinking, the synthesis of the protein or agent on the surface, etc.
Following binding of the protein or agent, excess unbound material is removed by washing.
The sample receiving areas may then be blocked through incubation with bovine serum
albumin (BSA), casein or other innocuous protein or other moiety.

In a preferred embodiment, the lung cancer protein is bound to the support, and a test
compound is added to the assay. Alternatively, the candidate agent is bound to the support
and the lung cancer protein is added. Novel binding agents include specific antibodies,
non-natural binding agents identified in screens of chemical libraries, peptide analogs, etc. Of
particular interest are screening assays for agents that have a low toxicity for human cells. A
wide variety of assays may be used for this purpose, including labeled in vitro protein-protein
binding assays, electrophoretic mobility shift assays, immunoassays for protein binding,
functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the lung cancer
protein may be done in a number of ways. In a preferred embodiment, the compound is
labeled, and binding determined directly, e.g., by attaching all or a portion of the lung cancer
protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label), washing
off excess reagent, and determining whether the label is present on the solid support. Various
blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the proteins (or
proteinaceous candidate compounds) can be labeled. Alternatively, more than one
component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophor for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents are also
useful.

In one embodiment, the binding of the test compound is determined by competitive
binding assay. The competitor may be a binding moiety known to bind to the target molecule
(i.e., a lung cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under
certain circumstances, there may be competitive binding between the compound and the
binding moiety, with the binding moiety displacing the compound. In one embodiment, the
test compound is labeled. Either the compound, or the competitor, or both, is added first to
the protein for a time sufficient to allow binding, if present. Incubations may be performed at
a temperature which facilitates optimal activity, typically between 4 and 40° C. Incubation
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed
away. The second component is then added, and the presence or absence of the labeled
component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by a test
compound. Displacement of the competitor is an indication that the test compound is binding
to the lung cancer protein and thus is capable of binding to, and potentially modulating, the
activity of the lung cancer protein. In this embodiment, either component can be labeled.
Thus, e.g., if the competitor is labeled, the presence of label in the wash solution indicates
displacement by the agent. Alternatively, if the test compound is labeled, the presence of the
label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with incubation and
washing, followed by the competitor. The absence of binding by the competitor may indicate
that the test compound is bound to the lung cancer protein with a higher affinity. Thus, if the
test compound is labeled, the presence of the label on the support, coupled with a lack of
competitor binding, may indicate that the test compound is capable of binding to the lung
cancer protein.

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the lung cancer proteins. In one
embodiment, the methods comprise combining a lung cancer protein and a competitor in a
first sample. A second sample comprises a test compound, a lung cancer protein, and a
competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the lung cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the lung cancer protein.
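
The two-sample comparison described above can be expressed as a simple calculation. The
sketch below is illustrative only: the replicate signal values are invented, and
"percent displacement" is one convenient way, assumed here, of summarizing the difference in
competitor binding between the first and second samples.

```python
from statistics import mean

def percent_displacement(competitor_only, with_test_compound):
    """Percent reduction in bound-competitor signal caused by the test compound."""
    baseline = mean(competitor_only)
    if baseline == 0:
        raise ValueError("baseline competitor signal is zero")
    return 100.0 * (baseline - mean(with_test_compound)) / baseline

# Hypothetical replicate readings of bound, labeled competitor (arbitrary units).
first_sample = [1000.0, 980.0, 1020.0]   # protein + competitor
second_sample = [400.0, 420.0, 380.0]    # protein + competitor + test compound

print(percent_displacement(first_sample, second_sample))  # 60.0
```

A value near zero would suggest that the test compound does not affect competitor binding,
while a large value is consistent with the compound binding to the protein.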

Alternatively, differential screening is used to identify drug candidates that bind to the
native lung cancer protein, but cannot bind to modified lung cancer proteins. The structure of
the lung cancer protein may be modeled, and used in rational drug design to synthesize agents
that interact with that site. Drug candidates that affect the activity of a lung cancer protein
are also identified by screening drugs for the ability to either enhance or reduce the activity of
the protein.

Positive controls and negative controls may be used in the assays. Preferably control
and test samples are performed in at least triplicate to obtain statistically significant results.
Incubation of all samples is for a time sufficient for the binding of the agent to the protein.
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled agent determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of
bound compound.
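
One way to check whether triplicate positive and negative controls are well enough separated
to support a screen is the Z'-factor, a quality metric commonly used in high throughput
screening. It is not named in this disclosure and is offered here only as an illustrative
sketch; the replicate counts are hypothetical.

```python
from statistics import mean, stdev

def z_prime(positive, negative):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    mu_p, mu_n = mean(positive), mean(negative)
    if mu_p == mu_n:
        raise ValueError("control means are identical")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / abs(mu_p - mu_n)

# Hypothetical triplicate scintillation counts for each control.
positive_control = [90.0, 100.0, 110.0]
negative_control = [5.0, 10.0, 15.0]

print(round(z_prime(positive_control, negative_control), 3))  # 0.5
```

Values approaching 1 indicate tight replicates and a wide window between controls; values
near or below zero indicate that the controls overlap.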

A variety of other reagents may be included in the screening assays. These include
reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used to
facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a lung cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising lung cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a lung cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence or previous or
subsequent exposure of physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate lung cancer agents are identified. Compounds
with pharmacological activity are able to enhance or interfere with the activity of the lung
cancer protein. Once identified, similar structures are evaluated to identify critical structural
features of the compound.

In one embodiment, a method of inhibiting lung cancer cell division is provided. The
method comprises administration of a lung cancer inhibitor. In another embodiment, a
method of inhibiting lung cancer is provided. The method may comprise administration of a
lung cancer inhibitor. In a further embodiment, methods of treating cells or individuals with
lung cancer are provided, e.g., comprising administration of a lung cancer inhibitor.

In one embodiment, a lung cancer inhibitor is an antibody as discussed above. In 
another embodiment, the lung cancer inhibitor is an antisense molecule. 



WO 02/086443 PCT/US02/12476

A variety of cell growth, proliferation, viability, and metastasis assays are known to 
those of skill in the art, as described below. 

Soft agar growth or colony formation in suspension 
Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of lung cancer sequences, which when expressed in host cells, inhibit abnormal
cellular proliferation and transformation. A therapeutic compound would reduce or eliminate
the host cells' ability to grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.),
herein incorporated by reference. See also the methods section of Garkavtsev, et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth

Normal cells typically grow in a flat and organized pattern in a petri dish until they
touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype, become contact inhibited, and grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a preferred
method of measuring density limitation of growth. Transformed host cells are transfected
with a lung cancer-associated sequence and are grown for 24 hours at saturation density in
non-limiting medium conditions. The percentage of cells labeling with (³H)-thymidine is
determined autoradiographically. See, Freshney (1994), supra.
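
The labeling index described above is the fraction of scored cells that have incorporated
(³H)-thymidine, expressed as a percentage. The following is a minimal Python sketch of that
arithmetic, assuming labeled and total cell counts have already been scored from the
autoradiograph. The function name and the example counts are hypothetical and are not taken
from this specification.

```python
def labeling_index(labeled_cells: int, total_cells: int) -> float:
    """Return the percentage of scored cells that carry the label."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must be between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts at saturation density (illustrative only).
control_index = labeling_index(180, 400)      # untransfected transformed cells
transfected_index = labeling_index(60, 400)   # cells transfected with the test sequence
print(control_index, transfected_index)
```

A lower index in the transfected culture than in the control would be consistent with
restored density limitation of growth.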

Growth factor or serum dependence 
Transformed cells typically have a lower serum dependence than their normal
counterparts (see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J.
Exp. Med. 131:836-879; Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control.

Tumor-specific marker levels

Tumor cells release greater amounts of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth"
in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman (1992) "Angiogenesis and Cancer" in Sem. Cancer Biol.

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkeless, et al. (1974) J. Biol. Chem. 249:4295-4305;
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with
tumor growth" in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184; Freshney
Anticancer Res. 5:111-130 (1985).


Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate lung cancer-associated
sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the level of
invasion of host cells can be measured by using filters coated with Matrigel or some other
extracellular matrix constituent. Penetration into the gel, or through to the distal side of the
filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.



Tumor growth in vivo

Effects of lung cancer-associated sequences on cell growth can be tested in transgenic
or immune-suppressed mice. Knock-out transgenic mice can be made, in which the lung
cancer gene is disrupted or in which a lung cancer gene is inserted. Knock-out transgenic
mice can be made by insertion of a marker gene or other heterologous gene into the
endogenous lung cancer gene site in the mouse genome via homologous recombination.
Such mice can also be made by substituting the endogenous lung cancer gene with a mutated
version of the lung cancer gene, or by mutating the endogenous lung cancer gene, e.g., by
exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288). Chimeric targeted mice can be
derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory
Manual, Cold Spring Harbor Laboratory and Robertson (ed. 1987) Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, IRL Press, Washington, D.C.

Alternatively, various immune-suppressed or immune-deficient host animals can be
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J.
Natl. Cancer Inst. 52:921), a SCID mouse, a thymectomized mouse, or an irradiated mouse
(see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263; Selby, et al. (1980) Br. J. Cancer 41:52)
can be used as a host. Transplantable tumor cells (typically about 10⁶ cells) injected into
isogenic hosts will produce invasive tumors in a high proportion of cases, while normal cells
of similar origin will not. In hosts which developed invasive tumors, cells expressing a lung
cancer-associated sequence are injected subcutaneously. After a suitable length of time,
preferably 4-8 weeks, tumor growth is measured (e.g., by volume or by its two largest
dimensions) and compared to the control. Tumors that show a statistically significant reduction
(using, e.g., Student's t test) are said to have inhibited growth.
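
The comparison described above can be sketched as follows. This is an illustrative Python
example only, under stated assumptions: tumor volume is estimated from the two largest
dimensions with the common ellipsoid approximation V = length x width^2 / 2 (an assumption;
the specification does not prescribe a formula), the caliper measurements are hypothetical,
and the test is the classical pooled-variance two-sample Student's t test.

```python
import math
from statistics import mean, variance


def tumor_volume(length_mm: float, width_mm: float) -> float:
    """Estimate tumor volume (mm^3) from its two largest dimensions.

    Assumption: ellipsoid approximation V = length * width^2 / 2.
    """
    return length_mm * width_mm ** 2 / 2.0


def student_t(treated, control):
    """Return (t statistic, degrees of freedom) for a pooled-variance t test."""
    n_t, n_c = len(treated), len(control)
    if n_t < 2 or n_c < 2:
        raise ValueError("each group needs at least two animals")
    df = n_t + n_c - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((n_t - 1) * variance(treated) + (n_c - 1) * variance(control)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_t + 1.0 / n_c))
    return (mean(treated) - mean(control)) / std_err, df


# Hypothetical caliper measurements (length, width) in mm, one pair per mouse.
treated_dims = [(8.0, 5.0), (7.5, 5.5), (9.0, 6.0), (8.5, 5.0)]
control_dims = [(12.0, 9.0), (13.5, 8.5), (11.0, 9.5), (12.5, 10.0)]

treated_vol = [tumor_volume(length, width) for length, width in treated_dims]
control_vol = [tumor_volume(length, width) for length, width in control_dims]

t_stat, df = student_t(treated_vol, control_vol)
print(f"t = {t_stat:.2f}, df = {df}")
```

A negative t statistic whose magnitude exceeds the critical value for the chosen significance
level and degrees of freedom would indicate reduced tumor growth in the treated group; the
critical value or p-value would be taken from a t table or a statistics package.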

Polynucleotide modulators of lung cancer

Antisense and RNAi Polynucleotides 

In certain embodiments, the activity of a lung cancer-associated protein is
downregulated, or entirely inhibited, by the use of antisense or an inhibitory polynucleotide,
i.e., a nucleic acid complementary to, and which can preferably hybridize specifically to, a
coding mRNA nucleic acid sequence, e.g., a lung cancer protein mRNA, or a subsequence
thereof. Binding of the antisense polynucleotide to the mRNA reduces the translation and/or
stability of the mRNA.

In the context of this invention, antisense polynucleotides can comprise naturally-occurring
nucleotides, or synthetic species formed from naturally-occurring subunits or their
close homologs. Antisense polynucleotides may also have altered sugar moieties or inter-sugar
linkages. Exemplary among these are the phosphorothioate and other sulfur-containing
species which are known for use in the art. Analogs are comprehended by this invention so
long as they function effectively to hybridize with the lung cancer protein mRNA. See, e.g.,
Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant means,
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors,
including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.
Antisense molecules as used herein include antisense or sense oligonucleotides.
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the antisense
strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA
(antisense) sequences for lung cancer molecules. A preferred antisense molecule is for a lung
cancer sequence in the tables, or for a ligand or activator thereof. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein
is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659 and van der Krol, et al.
(1988) BioTechniques 6:958.
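
As an illustration of deriving candidate antisense oligonucleotides from a cDNA sequence, the
following Python sketch enumerates every window of a chosen length along a coding (sense)
strand and returns the reverse complement of each window. It is a sketch under stated
assumptions, not the method of the cited references: the function names and the example
sequence are hypothetical, only DNA letters A, C, G, T are handled, and the length check
simply mirrors the preferred range of about 14 to 30 nucleotides stated above. No scoring of
specificity, secondary structure, or efficacy is attempted.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_candidates(cdna: str, length: int = 20):
    """List (start, oligo) pairs: the antisense oligo for each cDNA window."""
    if not 14 <= length <= 30:
        raise ValueError("length should be from 14 to 30 nucleotides")
    cdna = cdna.upper()
    return [
        (start, reverse_complement(cdna[start:start + length]))
        for start in range(len(cdna) - length + 1)
    ]


# Hypothetical 16-nt stretch of sense-strand cDNA (illustrative only).
example_cdna = "ATGGCCAAGCTTGACC"
for start, oligo in antisense_candidates(example_cdna, length=14):
    print(start, oligo)
```

Each returned oligonucleotide is written 5' to 3' and is complementary to the corresponding
stretch of the sense strand, and therefore to the mRNA transcribed from it.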

RNA interference is a mechanism to suppress gene expression in a sequence-specific
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999)
Genes Dev. 13:139-141; and Carthew (2001) Curr. Opin. Cell Biol. 13:244-248. In mammalian
cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have been shown to
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature 411:494-498.
The mechanism may be used to downregulate expression levels of identified genes, e.g.,
for treatment of, or validation of relevance to, disease.


Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and inhibit
transcription of lung cancer-associated nucleotide sequences. A ribozyme is an RNA
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axehead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology
25:289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990)
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0 360 257; U.S. Patent No.
5,254,678. Methods of preparing them are well known to those of skill in the art (see, e.g., WO
94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada, et al.
(1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al.
(1994) Virology 205:121-126).

Polynucleotide modulators of lung cancer may be introduced into a cell containing the
target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to,
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell
surface receptors. Preferably, conjugation of the ligand binding molecule does not
substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of lung
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that the use of antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating lung cancer in cells or organisms
are provided. In one embodiment, the methods comprise administering to a cell an anti-lung
cancer antibody that reduces or eliminates the biological activity of an endogenous lung
cancer protein. Alternatively, the methods comprise administering to a cell or organism a
recombinant nucleic acid encoding a lung cancer protein. This may be accomplished in any
number of ways. In a preferred embodiment, e.g., when the lung cancer sequence is down-regulated
in lung cancer, such state may be reversed by increasing the amount of lung cancer
gene product in the cell. This can be accomplished, e.g., by overexpressing the endogenous
lung cancer gene or administering a gene encoding the lung cancer sequence, using known
gene-therapy techniques. In a preferred embodiment, the gene therapy techniques include the
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g.,
as described in PCT/US93/03868, hereby incorporated by reference in its entirety.
Alternatively, e.g., when the lung cancer sequence is up-regulated in lung cancer, the activity
of the endogenous lung cancer gene is decreased, e.g., by the administration of a lung cancer
antisense or RNAi nucleic acid.

In one embodiment, the lung cancer proteins of the present invention may be used to
generate polyclonal and monoclonal antibodies to lung cancer proteins. Similarly, the lung
cancer proteins can be coupled, using standard technology, to affinity chromatography
columns. These columns may then be used to purify lung cancer antibodies useful for
production, diagnostic, or therapeutic purposes. In a preferred embodiment, the antibodies
are generated to epitopes unique to a lung cancer protein; that is, the antibodies show little or
no cross-reactivity to other proteins. The lung cancer antibodies may be coupled to standard
affinity chromatography columns and used to purify lung cancer proteins. The antibodies
may also be used as blocking polypeptides, as outlined above, since they will specifically
bind to the lung cancer protein.



Methods of identifying variant lung cancer-associated sequences

Without being bound by theory, expression of various lung cancer sequences is
correlated with lung cancer. Accordingly, disorders based on mutant or variant lung cancer
genes may be determined. In one embodiment, the invention provides methods for
identifying cells containing variant lung cancer genes, e.g., determining all or part of the
sequence of at least one endogenous lung cancer gene in a cell. In a preferred embodiment,
the invention provides methods of identifying the lung cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one lung cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced lung cancer gene to a known lung cancer gene, i.e.,
a wild-type gene.

The sequence of all or part of the lung cancer gene can then be compared to the
sequence of a known lung cancer gene to determine if any differences exist. This can be
done using known homology programs, such as Bestfit, etc. In a preferred embodiment, the
presence of a difference in the sequence between the lung cancer gene of the patient and the
known lung cancer gene correlates with a disease state or a propensity for a disease state, as
outlined herein.
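
The comparison step above can be illustrated with a short Python sketch. This is not Bestfit,
which performs a local sequence alignment; as a stand-in, the sketch uses the standard-library
difflib.SequenceMatcher to report substitutions, insertions, and deletions between a known
(reference) sequence and a sample sequence. The function name and both example sequences are
hypothetical, and the approach is suitable only for short, closely related sequences.

```python
from difflib import SequenceMatcher


def sequence_differences(reference: str, sample: str):
    """Return a list of (kind, reference_position, reference_bases, sample_bases).

    kind is 'replace', 'delete', or 'insert', as reported by difflib;
    reference_position is the 0-based index into the reference sequence.
    """
    reference, sample = reference.upper(), sample.upper()
    # autojunk=False so that frequently repeated bases are never ignored.
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    return [
        (tag, i1, reference[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical sequences differing by a single base (illustrative only).
known_gene = "ATGGCCAAGCTT"
patient_gene = "ATGGCTAAGCTT"
for difference in sequence_differences(known_gene, patient_gene):
    print(difference)
```

An empty result means that no difference from the known sequence was detected over the
compared region; any reported difference would still need to be assessed for its correlation
with a disease state.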

In a preferred embodiment, the lung cancer genes are used as probes to determine the
number of copies of the lung cancer gene in the genome.

In another preferred embodiment, the lung cancer genes are used as probes to
determine the chromosomal localization of the lung cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when
chromosomal abnormalities such as translocations and the like are identified in the lung
cancer gene locus.

Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a lung cancer protein or
modulator thereof is administered to a patient. By "therapeutically effective dose" herein is
meant a dose that produces effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846,
082476918X, 0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of
Pharmaceutical Compounding; and Pickar (1999) Dosage Calculations). Adjustments for
lung cancer degradation, systemic versus localized delivery, and rate of new protease
synthesis, as well as the age, body weight, general health, sex, diet, time of administration,
drug interaction and the severity of the condition may be necessary, and will be ascertainable
with routine experimentation by those skilled in the art.

A "patient" for the purposes of the present invention includes both humans and other
animals, particularly mammals. Thus the methods are applicable to both human therapy and
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a
primate, and in the most preferred embodiment the patient is human.

The administration of the lung cancer proteins and modulators thereof of the present
invention can be done in a variety of ways, including, but not limited to, orally,
subcutaneously, intravenously, intranasally, transdermally, intraperitoneally, intramuscularly,
intrapulmonary, vaginally, rectally, or intraocularly. In some instances, e.g., in the treatment
of wounds and inflammation, the lung cancer proteins and modulators may be directly
applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a lung cancer
protein in a form suitable for administration to a patient. In the preferred embodiment, the
pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the following:
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose,
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents;
coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit dosage
forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that lung cancer protein modulators (e.g., antibodies, antisense
constructs, ribozymes, small organic molecules, etc.) when administered orally, should be
protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a lung cancer protein
modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier.
A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions
are sterile and generally free of undesirable matter. These compositions may be sterilized by
conventional, well known sterilization techniques. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate physiological
conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like,
e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate
and the like. The concentration of active agent in these formulations can vary widely, and
will be selected primarily based on fluid volumes, viscosities, body weight and the like in
accordance with the particular mode of administration selected and the patient's needs (e.g.,
Remington's Pharmaceutical Science (15th ed., 1980) and Hardman, et al. (eds. 1996)
Goodman and Gilman: The Pharmacological Basis of Therapeutics).

Thus, a typical pharmaceutical composition for intravenous administration would be
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per
day may be used, particularly when the drug is administered to a secluded site and not into
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher
dosages are possible in topical administration. Actual methods for preparing parenterally
administrable compositions will be known or apparent to those skilled in the art, e.g.,
Remington's Pharmaceutical Science and Goodman and Gilman, The Pharmacological Basis
of Therapeutics, supra.



The compositions containing modulators of lung cancer proteins can be administered
for therapeutic or prophylactic treatments. In therapeutic applications, compositions are
administered to a patient suffering from a disease (e.g., a cancer) in an amount sufficient to
cure or at least partially arrest the disease and its complications. An amount adequate to
accomplish this is defined as a "therapeutically effective dose." Amounts effective for this
use will depend upon the severity of the disease and the general state of the patient's health.
Single or multiple administrations of the compositions may be administered depending on the
dosage and frequency as required and tolerated by the patient. In any event, the composition
should provide a sufficient quantity of the agents of this invention to effectively treat the
patient. An amount of modulator that is capable of preventing or slowing the development of
cancer in a mammal is referred to as a "prophylactically effective dose." The particular dose
required for a prophylactic treatment will depend upon the medical condition and history of
the mammal, the particular cancer being prevented, as well as other factors such as age,
weight, gender, administration route, efficiency, etc. Such prophylactic treatments may be
used, e.g., in a mammal who has previously had cancer to prevent a recurrence of the cancer,
or in a mammal who is suspected of having a significant likelihood of developing cancer
based, at least in part, upon gene expression profiles. Vaccine strategies may be used, in
either a DNA vaccine form, or protein vaccine.

It will be appreciated that the present lung cancer protein-modulating compounds can
be administered alone or in combination with additional lung cancer modulating compounds
or with other therapeutic agents, e.g., other anti-cancer agents or treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in the tables, such as antisense or RNAi
polynucleotides or ribozymes, will be introduced into cells, in vitro or in vivo. The present
invention provides methods, reagents, vectors, and cells useful for expression of lung cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo, or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and other well known methods for introducing cloned genomic
DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g.,
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger), Ausubel, et al. (eds. 1999) Current Protocols (supplemented through
1999), and Sambrook, et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed., Vol.
1-3)).

In a preferred embodiment, lung cancer proteins and modulators are administered as
therapeutic agents, and can be formulated as outlined above. Similarly, lung cancer genes
(including both the full-length sequence, partial sequences, or regulatory sequences of the
lung cancer coding regions) can be administered in a gene therapy application. These lung
cancer genes can include antisense or inhibitory applications, e.g., as inhibitory RNA or gene
therapy (e.g., for incorporation into the genome) or as antisense compositions.

Lung cancer polypeptides and polynucleotides can also be administered as vaccine
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341),
peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG") microspheres
(see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al. (1994) Vaccine
12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions contained in
immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990) Nature
344:873-875; Hu, et al. (1998) Clin. Exp. Immunol. 113:235-243), multiple antigen peptide
systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413; Tam
(1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., p. 379 In: Kaufmann (ed. 1996) Concepts in vaccine development;
Chakrabarti, et al. (1986) Nature 320:535; Hu, et al. (1986) Nature 320:537; Kieny, et al.
(1986) AIDS Bio/Technology 4:790; Top, et al. (1971) J. Infect. Dis. 124:148; Chanda, et al.
(1990) Virology 175:535), particles of viral or synthetic origin (see, e.g., Kofler, et al. (1996)
J. Immunol. Methods 192:25; Eldridge, et al. (1993) Sem. Hematol. 30:16; Falo, et al. (1995)
Nature Med. 7:649), adjuvants (Warren, et al. (1986) Annu. Rev. Immunol. 4:369; Gupta, et
al. (1993) Vaccine 11:293), liposomes (Reddy, et al. (1992) J. Immunol. 148:1585; Rock
(1996) Immunol. Today 17:131), or, naked or particle absorbed cDNA (Ulmer, et al. (1993)
Science 259:1745; Robinson, et al. (1993) Vaccine 11:957; Shiver, et al., p. 423 In:
Kaufmann (ed. 1996) Concepts in vaccine development; Cease and Berzofsky (1994) Annu.
Rev. Immunol. 12:923 and Eldridge, et al. (1993) Sem. Hematol. 30:16). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a substance
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient.
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465 as well as
U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; WO
98/04720; and in more detail below. Examples of DNA-based delivery technologies include
"naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery, cationic lipid
complexes, and particle-mediated ("gene gun") or pressure-mediated delivery (see, e.g., U.S.
Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the invention
can be expressed by viral or bacterial vectors. Examples of expression vectors include
attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode lung cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the
like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et
al. (2000) Mol. Med. Today 6:66-71; Shedlock, et al. (2000) J. Leukoc. Biol. 68:793-806;
Hipp, et al. (2000) In Vivo 14:571-85).

Methods for the use of genes as DNA vaccines are well known, and include placing a 
lung cancer gene or portion of a lung cancer gene under the control of a regulatable promoter 
or a tissue-specific promoter for expression in a lung cancer patient. The lung cancer gene 
used for DNA vaccines can encode full-length lung cancer proteins, but more preferably 
encodes portions of the lung cancer proteins including peptides derived from the lung cancer 
protein. In one embodiment, a patient is immunized with a DNA vaccine comprising a 
plurality of nucleotide sequences derived from a lung cancer gene. For example, lung 
cancer-associated genes or sequences encoding subfragments of a lung cancer protein are 
introduced into expression vectors and tested for their immunogenicity in the context of Class I 
MHC and an ability to generate cytotoxic T cell responses. This procedure provides for 
production of cytotoxic T cell responses against cells which present antigen, including 
intracellular epitopes. 

In a preferred embodiment, DNA vaccines include a gene encoding an adjuvant 
molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase 
the immunogenic response to the lung cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment, lung cancer genes find use in generating animal 
models of lung cancer. When the lung cancer gene identified is repressed or diminished in 
metastatic tissue, gene therapy technology, e.g., wherein antisense or inhibitory RNA is directed 
to the lung cancer gene, will also diminish or repress expression of the gene. Animal models 
of lung cancer find use in screening for modulators of a lung cancer-associated sequence or 
modulators of lung cancer. Similarly, transgenic animal technology including gene knockout 
technology, e.g., as a result of homologous recombination with an appropriate gene targeting 
vector, will result in the absence or increased expression of the lung cancer protein. When 
desired, tissue-specific expression or knockout of the lung cancer protein may be necessary. 

It is also possible that the lung cancer protein is overexpressed in lung cancer. As 
such, transgenic animals can be generated that overexpress the lung cancer protein. 
Depending on the desired expression level, promoters of various strengths can be employed 
to express the transgene. Also, the number of copies of the integrated transgene can be 
determined and compared for a determination of the expression level of the transgene. 
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Animals generated by such methods will find use as animal models of lung cancer and are 

additionally useful in screening for modulators to treat lung cancer. 



Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above, kits are 
also provided by the invention. In diagnostic and research applications such kits may include 
at least one of the following: assay reagents, buffers, lung cancer-specific nucleic acids or 
antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, RNAi, 
dominant negative lung cancer polypeptides or polynucleotides, small molecule inhibitors of 
lung cancer-associated sequences, etc. A therapeutic product may include sterile saline or 
another pharmaceutically acceptable emulsion and suspension base. 

In addition, the kits may include instructional materials containing instructions (e.g., 
protocols) for the practice of the methods of this invention. While the instructional materials 
typically comprise written or printed materials, they are not limited to such. A medium 
capable of storing such instructions and communicating them to an end user is contemplated 
by this invention. Such media include, but are not limited to, electronic storage media (e.g., 
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such 
media may include addresses to internet sites that provide such instructional materials. 

The present invention also provides for kits for screening for modulators of lung 
cancer-associated sequences. Such kits can be prepared from readily available materials and 
reagents. For example, such kits can comprise one or more of the following materials: a lung 
cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing 
lung cancer-associated activity. Optionally, the kit contains biologically active lung cancer 
protein. A wide variety of kits and components can be prepared according to the present 
invention, depending upon the intended user of the kit and the particular needs of the user. 
Diagnosis would typically involve evaluation of a plurality of genes or products. The genes 
typically will be selected based on correlations with important parameters in disease which 
may be identified in historical or outcome data. 
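
The selection of genes by correlation with a disease parameter can be illustrated with a minimal sketch. The specification does not state which correlation measure or cutoff is used; the Pearson coefficient, the function names `pearson` and `select_genes`, and all sample values below are assumptions introduced purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


def select_genes(expression, outcome, top_n):
    """Rank genes by absolute correlation with the outcome; keep the top_n."""
    scored = [(gene, pearson(values, outcome)) for gene, values in expression.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_n]


# Hypothetical expression values for four samples (not data from the tables).
expression = {
    "gene_A": [1.0, 2.0, 3.0, 4.0],
    "gene_B": [4.0, 3.0, 2.0, 1.0],
    "gene_C": [2.0, 2.1, 1.9, 2.0],
}
outcome = [10.0, 20.0, 30.0, 40.0]  # hypothetical disease parameter per sample
panel = select_genes(expression, outcome, top_n=2)
```

Ranking on the absolute value keeps both positively and negatively correlated genes in the panel, which suits a plurality of markers that may be either up- or down-regulated.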
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Example 1: Gene Chip Analysis 

Molecular profiles of various normal and cancerous tissues were determined and 
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as 
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981- 
993). 

Tables 1A and 1B were previously filed on April 18, 2001 in USSN 60/284,770 (18501401SOOUS) and on November 29, 2001 in USSN 60/334,370 
(18S01-001S20US) 
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101330 L43821 Hs.80261 enhancer of flamentagon 1 (cas^loe do 

101336 L49169 Hs.75678 FBJ murine osteosarcoma viral oncogene h 

101345 L76380 Hs.152175 calcitonin receptor-like 

101678 M62505 Hs.2161 complement component 5 receptor 1 (C5a l 

101764 M80563 Hs.81256 S100 calcium-binding protein A4 (calcium 
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20 101842 M93221 Hs75182 mannose receptor; C type 1 

102283 U31384 Hs.83381 guanine nucleotide binding protein 11 

102363 U39447 Hs.198241 amine oxidase; copper containing 3 (vasc 
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25 103025 XS4131 Hs.1 23841 protein tyrosine phosphcdase; receptor t 
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118279 AA4e6073 Hs57362 ESTs 

117023 H88157 Hs.41105 ESTs 
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sohjte carrier family 35 (CMP-^ialic ad 
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1339B5 
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GRB2-assodated binding protein 2 


131678 


AI126821 
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13CS57 
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ESTs 




1^1355 
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Hs.17409 
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OKFZP434H204 piotan 




1305S2 


050402 
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ulutecanierfanily 11 (proiCRKcoupisd 




130^ 
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Hs.116774 


integrin. a^iha 1 


11.60 


130355 
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euksfyoSc franstaOon hcSaSon fsdor 
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AWS72422 


HS.153SS3 


MAO (rn^aias agaostdecapeiitaple^ 
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NJil.fl00323 


Hs.153514 


Fe&i£Ss pigmentosa GTPase legutstor 
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Hs.132390 


zinc fin^ proton 36 (KOX 18) 
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ESTs 
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vasoacfive tntesQn^ pepSde leceptor 1 


39.20 
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ESTs 
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ATP-blnding cassette, sub-family 6 (WHIT 
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AA305407 


Hs.102308 


potash oiwardty^ecGM channel, s 
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Hs.56340 


ESTs 
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Hs.186877 
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ESTs 
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127896 
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ESTs 
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ESTs 
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ESTs 
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ESTs 
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Hs.322430 
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X80031 


Hs.530 


collagen, type IV. alpha 3 (Goodpasture 
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ESTs 
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Hs.157392 


Hbmo sapiens cOl^ FU20760 fis. done CO 
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127535 


AA558424 


Hs.164450 


ESTs 


17.50 


127404 
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Hs.270224 


ESTs 


14.60 
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Hs.44896 


OnaJ (Hsp40) homotog. sutifamay mentbe 
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Hs.190272 


ESTs 
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CCAAT/enhancer famding protein (OEBP}. 
chroroogrann B (secreiogranto 1) 
E1A Unding pnMi p300 
KIAAD648 protein 

guanine nudeoSde exchange fiador ( 
repTcafion £actor C (ac&vator 1) 4 (37 
Koaio sapiens cONA: RJ22373fis.doneH 
karytfhenn aipha 3 (impor&i alpha 4) 
ESTs 

cydinlama-6a 

bone moiphogeneflc protein 7 (osteogenic 

carbonic anhydraseXU 

guanine monphosphate synthetase 

(^dc42 guamne exdiaoge {actor (GEF) 9 

ESTs. similar to T33468 hypolheti 

phosphoserine phosphatase ^ 

a disintegrin and mdaflopiotdnass donta 

HSKiyM protein 

twist (DrosophUa) homotog (acrocephalos 
SRB7 (suppressor of RNA polymerase B. ye 
hypoSiettcal protein aj 10074 
general transcripion factor DiA 
secretegranin 11 (chromogranin C) 
(fiscs, targe (DFKophila) homobg 5 
serine (or cysteine) proteinase iitao 
KiAA0203 gens product 
Ba2/adenovirus E1B 19kl>4nteracflng pro 
ESTs. Moderately sinular to A45010 
phosphontjosytglydnamide formyHransfer 
SWVSNF related, ntairix associated. acS 
dual-specifidly tyiosine-{Y)-phosphoiyl 
G antigen 7B 

hydroxysteroid (17-bete) dehydrogenase 
cydin-dependentldnase 5. regulatory su 
neurotrophin 3 



29.60 
27^ 
2BJ6 
20^ 
22.40 
19.60 
19.40 
21.40 

iiaoo 

25^ 
40£0 
24.60 
21.00 
33.40 
6080 
20.40 
29.40 
32.40 
27.40 
75,60 
31^ 
32.40 
a40 
61.20 
22.33 
23^ 
30.00 
2a80 
51.60 
33.00 
82.00 

69.33 
33.20 
31.60 
30.60 
23.40 
49.20 
2020 
20.80 
37.60 
53.40 
31.60 
28.80 



TABLE 4B shows the accession numbers for those primekeys lacking unigeneID's for Table 4A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
'Accession' column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey     CAT number         Accessions 

123819   371681_1           AA602964 AA609200 
126433   127143_1           AA325508 AA099517 N89423 
126872   142696_1           AW450979 AA136653 AA136656 AW419381 AA98435a AA492073 BE168945 AA809054 AW238038 BE011212 BE011359 
                            BE011367 BE011368 BE011362 BE011215 BE011385 BE011363 
106851   322947_1           AI458623 AA639708 AA485W9 R22065 AA485570 
118720   genbank_N73515     N73515 
12)515   genbank_AA258356   AA258356 
117099   321871_1           H93699 H97976 H80036 
101447   entrez_M21305      M21305 
123130   genbank_AA467200   AA487200 
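
The three-column layout described above (Pkey, CAT number, Accessions) maps naturally onto a small record type. The following is a minimal sketch, assuming whitespace-separated rows with the cluster number written as a single token; the names `ClusterRow`, `parse_row`, and `index_by_accession` are hypothetical and are not part of the specification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClusterRow:
    pkey: str              # unique Eos probeset identifier number
    cat_number: str        # gene cluster number
    accessions: List[str]  # Genbank accession numbers in the cluster


def parse_row(line: str) -> ClusterRow:
    """Split one table row into its probeset key, cluster number, and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and >=1 accession: {line!r}")
    return ClusterRow(fields[0], fields[1], fields[2:])


def index_by_accession(rows: List[ClusterRow]) -> Dict[str, List[str]]:
    """Map each accession number to the probeset keys whose cluster contains it."""
    index: Dict[str, List[str]] = {}
    for row in rows:
        for accession in row.accessions:
            index.setdefault(accession, []).append(row.pkey)
    return index


# Two rows taken from the table, written in the assumed single-token form.
rows = [
    parse_row("117099 321871_1 H93699 H97976 H80036"),
    parse_row("101447 entrez_M21305 M21305"),
]
lookup = index_by_accession(rows)
```

Indexing by accession makes it possible to go from a sequence of interest back to the probeset that was designed from its cluster.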



WO 02/086443                                                    PCT/US02/12476 



intensity (AI), a normalized value reflecting the 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

R2: ^^i^^aS^^aoc^ltsma hng tumor samptes divided Iv the 90lh percenBe of Al far oomj^and 

ZISl^^h™«^4videdby9a^ 
iwmal bng. chronkaBy diseased lung and lunw sam^ 
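
The ratio columns above are described as a percentile of AI in one sample group divided by a percentile of AI in a comparison group. A minimal sketch of that arithmetic follows. The specification does not say how percentiles are interpolated, so linear interpolation between ranked values is an assumption, as are the function names and the sample values.

```python
def percentile(values, pct):
    """Percentile (pct in 0..100) by linear interpolation between ranked values."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_ratio(tumor_ai, reference_ai, pct=90.0):
    """Ratio of the pct-th percentile of AI in tumor vs. reference samples."""
    denominator = percentile(reference_ai, pct)
    if denominator == 0:
        raise ZeroDivisionError("reference percentile is zero")
    return percentile(tumor_ai, pct) / denominator


# Hypothetical average-intensity values (not taken from the tables).
tumor = [100.0, 200.0, 300.0, 400.0, 500.0]
normal = [10.0, 20.0, 30.0, 40.0, 50.0]
ratio = percentile_ratio(tumor, normal)
```

Comparing high percentiles rather than means makes the ratio less sensitive to a few low-expressing samples in either group.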



Pkey ExAccn UnigencID UnlgeneTHte 



R1 R2 R3 



R4 



00035 






lUUUOO 






I0Q037 






lUlA/f 1 


A28102 




lUUl 


XQ2308 


Ks.82962 


[00154 


K60720 


HsJ1892 




D17793 


Hs.78183 


inniRfl 
luuioo 


AW947D90 


Hs^lOI 




8E294407 


Hs59910 


100216 


AA489908 


Hs.1390 




MM QQ1949 


Hs.1189 






Hs.1600 








lUUMU 




Hs.77152 






Hs^793 


1UU00U 


W70171 


Hs.75939 
















1 IsUUO 


Ks.10842 


iMAQI 


D56165 


Hs 275163 






Hs.11 






Hs.99949 






Hs.1640 


lULO/O 


X00356 




1UUD£9 




Ks.21291 


lUUDOl 




H5.13274fl 


inflRT? 

lUVOf f 




Hs.57813 


100696 


014687 


Hs.1 21686 


100709 


N2^39 


Hs.100469 


100761 


RP708491 


Hs^5112 


100830 




H5.4756 


100667 


U14622 




100902 


[Yl luV£r9 


Hs.287270 


100906 


AU076916 


Hs.5398 


100960 


J00124 


Hs.117729 


101045 


J05614 




101061 


NM_000175 


Hs.180532 


101071. 


102840 


Hs.84244 


101124 


L10343 


Hs.112341 


101175 


U82671 


Hs.36980 


101181 


BE262621 


Hs.73798 


101204 


L24203 


Hs.82237 


101210 


L29301 


Hs.2353 


101216 


AA284166 


Hs^113 


101226 


AA333387 


Hs^6 


101233 


AL135173 


Hs.878 


101273 


Z11933 


Hs.182505 


101342 


U52112 


Hs.182018 


101346 


A1738616 


Hs.77348 


101369 


NM_OO0892 


Hs.1901 


101396 


B6257931 


Hs.78998 


101431 


BEia5289 


Hs,1076 


10144B 


NA^000424 


Hs.ig5850 


101462 


A1J035668 


Hs.73853 


101466 


BE262660 


Hs.170197 


101484 


AA053485 


Hs^lS 


101502 


M2695B 




101605 


AA307680 


Hs.75692 


1015^ 


NM.002I97 


Hs.154721 


101535 


X57152 


Hs.99853 


101577 


M34353 


Hs.1041 


101649 


AW9S9908 


Hs.1690 


101663 


KMJ003528 


Hs^TB 


101664 


AA43G989 


Hs.121017 


101669 


L24498 


Hs.80409 



AFFX control: GAPDH 

AFFX control: GAPDH 

AFFX control: GAPDH 

Human GABAa receptor alpha-3 subunit 

thymidylate synthetase 

KIAA0101 gene product 

aldo-keto reductase family 1, member C3 

minichromosome maintenance deficient (S. 

phosphofructokinase, platelet 

proteasome (prosome, macropain) subunit, 

E2F transcription factor 3 

chaperonin containing TCP1, subunit 5 (e 

protein disulfide isomerase-related prot 

minichromosome maintenance deficient (S. 



umfine monophosphate kinase 
K1AA01 75 sene product 
amylase, alpha 2A; pancreatic 
RAN, msxvba RAS oncogene {amily 
non^netastalic cells 2. protatn (NM23B) 
caicinoemtHyonic antigen^ated cell ad 
prolactoHnduoed protein 
coilagen. type VH. alpha 1 (epdennolys 
caldtonin/caldtonin-feiated polypepiid 
nutogen-activated protein Idnasa kinase 
Homo sapiens rit}osomal protein UQ mRNA, 
zinc ribtnn domain containing, 1 
general transotpSon factor (!A, 1 (37k 
n^elddAymphcad or nixedfneage ieukem 
KIAAOOlSgene product 
flap stnicture-spedfK endonuclease 1 
gb:Human transketolase-ia^d protein gene 
ret proto-oncogene (mulSple endocrine n 
guanine monpbosphate ^nthetase 
keraSn 14 (epidemioiysis Indlosa simple 
gb*>iuman proliferafingcell raidear anli 
glucose phosphate isomerase 
potassium voUage-gated channel. Shalne 
protease InHbitDr 3. sWn-derived (SKAL 
melanoma antigen, fiamily A, 2 
macrophage migration inhilMtDry fador ( 



opioid receptor, mul 
cycGn-dependent kinase Wabaor 3 (COK 
chaperonin contariirig TCP1. suixiniteA ( 



POU domaa dass 3, transcription bcto 
interleukin-1 reoeptor-assodated Mnase 



kaMn B. pl^ma (Fletcher tactof) 1 
proS^ating cell nuclear antigen 
small proline-rich protein IB (comifin) 
keratin 5 (epidemmlysis twDosa simplex 
bone morphogenetic protein 2 
ghitamiooxabacetic transarninase 2. mit 
udeiferDQ-uiduced protein wUh tetratri 
gteHumaDparattiynNdbormone-r^aledptD * 
asparaglnasynOietase 
aconitase l.sotuhle 
OirSlarin 

v-ros avian UR2 sarcoma vims oncogene h 
heparin-tanding growth factor binding pr 
H2B histone family, member 0 
H2A histone family, member A 
giwsflh arredand ftVA^lamage^dudUft 



aoo 



a84 

3.33 



2^ 



5.07 



3L10 
3.85 



7.20 



aeo 



7.60 



24.80 



2jsr 

3.12 

aso 

4.08 
153 

aso 

3.24 

a3i 

10.50 
4.02 



54.00 

S59 

7.00 



10.20 
&00 



12S1 



6.40 



15.65 



1450 

9l30 
20.60 



10.00 



R5 

&76 
5.77 
&75 

5.71 



4.52 
5.49 
5.67 

5i66 
3^1 
450 

4^ 
a79 

a49 
4.17 



21^9 
12^ 



38.80 
12X0 



9.09 



7.99 

5.16 

4w69 
4.19 

5.69 

7.90 
445 

4.17 

7.90 

4.01 

4.48 
4.6S 



7.60 



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101695 


M69136 


Hs.1356ffi 


101724 


LI 1590 


Hs-62) 


101748 


NM_001944 


Hs.1925 


101759 


M80244 


Hs.184Q)l 


101771 


Ni!_002432 


Hs.153837 


101804 


MS36S9 


Hs.169840 


101809 


MB8849 


HsJ23733 


101833 


AU07e442 


Hs.117938 


101842 


&i93221 


Hs.75182 


101851 


BE^64 


Hs.82045 


102002 


NlyL002484 


Hs£1469 


102039 


AL134223 


Hs.306098 


102072 


U09410 


Hs.78743 


102083 


T35901 


Hs.75117 


102111 


L38196 


Hs.81684 


102123 


NM_0018(B 


Hs.1594 


102154 


U17760 


Hs.75517 


102193 


AL036335 


Hs^13 


102217 


AA82g378 


Hs.301613 


102224 


NA^002810 


HS.14849S 


102234 


AW153390 


HS.27B554 


102251 


MM.004398 


Hs.41706 


102305 


AL0432Q2 


Hs.90073 


102330 


BE238063 


Hs.77254 


102340 


U37055 


Hs.278657 


102348 


U37519 


H5.B7539 


102368 


U39817 


Hs.36820 


102394 


NJiL00381& 


Hs.2442 


102404 


NJL005429 


Hs,79141 


102S37 


U57094 


Hs.a)477 


10S81 


AU077228 


H5.7725e 


102e05 


A1435128 


Hs.181369 


102610 


U65011 


Hs.30743 


102623 


AW249285 


Hs.37110 


102642 


AA20S847 


Hs.23016 


102654 


AV849989 


HS.2438S 


102659 


BE245169 


Hs.211610 


102659 


071207 


Hs.29279 


102672 


U72056 


Hs.29287 


102687 


NM.007019 


HS.930Q2 


102696 


BE540274 


Hs.239 


102768 


U82321 




102781 


BE258778 


Hs.108809 


102784 


U85658 


Hs.61796 


102824 


U90916 


Hs.82d45 


102829 


NKt006183 


Hs.80952 


102888 


A1346201 


Hs.76118 


102892 


SE440042 


Hs.83326 


102913 


WAL002275 


HS.B0342 


102935 


BE561850 


Hs.80506 


102951 


X15218 


Hs.2959 


102S83 


BE387202 


Hs.1 18638 


103023 


AW500470 


Hs.1 17950 


103036 


M13509 


Hs.8318g 


103038 


AA926960 


Hs.3348a3 


103060 


NM^005940 


Hs.155324 


103099 


AI693251 


Hs.8248 


103119 


X63629 


Hs.2877 


103168 


X53463 


Hs.2704 


103165 


NM^006825 


Hs.74368 


103192 


M22440 


Hs.170009 


103223 


eE275607 


Hs.1 708 


103242 


X76342 


Hs.389 


103316 


X83301 


Hs.324728 


103375 


NM_005982 


Hs.54416 


103376 


A1J036166 


Hs,323378 


103385 


NM_007069 


Hs.37189 


103391 


X94453 


Hs.114366 


103404 


BE394784 


HS.78S96 


103430 


BES64090 


Hs.20716 


103446 


X98834 


Hs.79971 


103476 


Y07701 


Hs.293007 


103477 


AJ011812 


Hs.119018 


103478 


BE5149d2 


Hs.38991 


103515 


Y10275 


Hs,56407 


103558 


BE616547 


Hs,2785 


103580 


AA328046 


Hs.46405 


103587 


BE27Q26& 


Hs^28 


103594 


AI368680 


Hs.816 


103636 


m 006235 


Hs.2407 


103768 


AFQ86009 




103841 


AA314821 


Hs.38178 


103847 


AF219945 


Hs.102237 


103913 


AWg87500 


HS.133S43 


104094 


AA418187 


HS^IS 



1857 



1Z80 



12X0 



12X0 



aeo 



3.33 



14.00 
12X0 



12.80 



chymasa l.mastcsfl 4.79 

tritous iteicptsQcM erSg&i 1 (Z3Q^40kO) 15.21 

desrnglan3(psmpiqgusvulgans3iiQg8a 55.50 
sobte carier faniily 7 (ca&inic ansno 
myelod eel nudesr (^erenOaSoa snl 

TTKjrotein kinase 4.S) 

gapjinc&»pR)letn.beta2,2QkO(cofn 140.00 

coyagefl.typeXVll.s!pha1 2.56 

nis&kss (neurits gowtb-fvomoSng factor 
Qudeo&l&timdng protein 1 (£xoSB.{in 7.80 
ddo^ reductase famiy 1 , mamber CI 

zinc finger protein 131 (ctone pHZ-IQ 7.40 
mSerieuldn enhancer l&idingtactor 2, 4 

suUbtrans&rase family, cytosoGc, 2A. 
oenlroniare protein A (17kD) 
lamitsa, beta 3 (nksin (12ScO), kaOnin 
sacred AhosphopralBin 1 (osteopontin. 
JTVl gene 

pi-oteasome (prosome. macropain} 26S subu 
beterochromatin-Ske prot^ 1 
OEAO» CAsp-(3t^AtaVVsp/Hts) box polypep 
chromosame segreg^n 1 (yeast homalog} 
chmmobox homotog 1 (Drosophita HP1 beta 
(naaQphaga stimulating 1 (hepatocyte gro 
aldeti^e dehydrogenase 3 faRfly, member 
Btoom syndrome 

a disintegrin and metalloprateinase doma 19.20 
vasoilar eixlotfteria) growth factor C 
RAB27A, member RAS oncogene fansJy 
entiancar of zeste (Orosophli) homolog 2 
ubtquilin fusion degradafion 1^ 
preferentially expressed antigen in meia 
melanoma antigen. famBy A. 9 

G protein^oupled receptor 22.00 
Human hbo647 mRNA sequence 
CUG triplet repeat. RNA-binding protein 
eyes absent (ITmsophila} homotc^ 2 
letlnc^astoma-binding protein 8 
ubiquitin canter protein E2-C 
iorfchead box Ml 

gb:Homo sapiens done 14.98 mRNA sequenc 
chaperonin containing TCP1, subunit 7 (e 
transcription factor AP-2 gamma (acfivat 
Homo sapiens cDMA: FLJ21930 fis, done H 
neurotensin 

ubiquiOn carboxyUernvnal esterase LI 
matrix metalioproteinase 3 (stromelysin 
kermis 

small nudear ribonudeoproten polypept 
v-ski avian sarcoma vira! oncogene homol 
non-metastatic ceOs 1, protein (NM23A) 
multifunctional polypeptide similar to S 
matrix raelalloprotelnase 1 fmterstitla] 
CDC28 protein kinase 1 
matrix metalloproteinase 11 (stromdysin 
l^H dehydrogenase (ubiquinone) Fe^ pro 
cadhedn 3, type 1 . P-cadherin (placenta 
glutathione peroxidase 2 (gastrointesSn 
transmembrane protein (63kO). endoplasmi 
transtbmfting growth fador. alpha 
chaperonin containing TCP1, subunit 3 (g 

aicohd dehydrogenase 7 {dass IV), mu o lOaOO 
SMA5 9.80 
sine ocuEs homeobox (Drosopliila) homoto 9.71 
coated vesicle membrane protein 14.00 

similar to rat HREV107 11.00 
pyrrofine-S^rboxylate synthetase (glut 293 
pmteasome (prosome, macnipain) subunit 
translocase of inner mitochoodiid membr 

sal(DrosopHla)^2 21.40 
aminopepGdasepuroniydn sensitive 13.00 
transcription factor NRF &40 
S100 ca!dunvbtn(&ig protein A2 
phosphoserine phosphatase 
kerafin 17 

polymerase (RrtA) 11 (ONA directed) pdyp 
574 oncofetal trophoblast gtycoprntdn 
SRY (sex detennining region Y)hbox 2 
POU domain, dass 2, associating factor 
gb:Homo sapiens fuQ length insert cONA 
}iypothetk2lpft2tBinRJ23468 aOO 
tubby supar-CamBy protein 10.40 
ESTs 15.60 
ESTs 6.60 



&20 
162 
&85 



4.50 



a87 
15.91 



77^ 
12.50 



&50 
&50 



aoo 



464 

2.93 



3.01 
27.90 



4.05 
3.07 



5.02 
10S) 
6.41 

78.50 

6.51 

150 
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4.10 



5.83 
4.35 
5.12 



&1B 
4.49 
5.80 

5.15 
4.17 



4.57 
3.98 



14.40 
6.70 



11.40 



a80 



7.40 



9.24 
5.54 



178 
4.26 



aso 



7.26 



179 
4.27 



162 
4.70 



115 
198 



184 



4^ 
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104150 


AL122044 


Hs.331633 


l^pothelicaf pnotah OKFZpS6N034 


104257 


BE550621 


Hs5222 


sstPOQcn rocftptof tBoCBng assooate 


104261 


AW248B4 


Hs5409 


RNA pdymerase 1 sdsBiU 


104331 


A6040i50 


HS.27S362 




104415 


BE41Q992 


Hs.256730 


h&nt&"f69ul3iBd hbSsSqd factpf Z'^-^phs 


104558 




Hs.88^9 


hypotheScal (xotein UGC4316 


104590 


AVV373062 


HsJB3S23 


iKictetir f Bffiptof sofefafiifly 1, qtoii^ L n) 


1046S 


AA360954 


HsJ272d8 


Homo sapiens cfXNA: FU21933 fis, duie H 


104860 


BE298665 


H$.14846 


Hixno saptens mRKA; cONA (^CF^64O016 (fir 


104689 


AA420450 


K5.292911 


ESTSk H^Hy ^1ar to S60712 baid-&fr 


104754 


A12IK234 


H5.155924 


cAMP FBspon^VB donsit inodutsSor 


104758 


BES6Q289 


Hs.7010 


NP0002 protein 


104971 


6E311926 


Hs.15830 


liypotheScai profdn FU12691 


105011 


BE091926 


Hs.15244 


rmtsGc spindle ctSed-coO rel^ pFOt 


105012 


AF0S8158 


HsJ329 


chromosoino 20 open re3(Sn9 frame 1 


105026 


AAfi09485 


Hs.124219 


t^pothetlcal pnateih FU12934 


105076 


AI5g82S2 


HsJ7810 


hypothetical proton (yIGC14833 


105132 


AA148164 


Hs.247280 


HBV associated factor 


105143 


AI36aa36 


Hs^4808 


ESTs, Wealdy simnar to 138022 liypotlieti 


105158 


AW976357 


Hs.234545 


tiypo^tical pmtstn Klfi^ 


105175 


AA305384 


H&25740 


EROl (S. cere\nsiae)-l3ce 


105200 


M3281fl2 


Hs.24641 


Cfioskeieion associated protei/] 2 


105264 


AA227934 




^3r57e08^1 SoatesJIhHXSPu.SI Homosapi 


105298 


BE387790 


Hs.26369 


liypolhetical protein FU20287 


105409 


AW505076 


H&301855 


DiGeon^e syndrome critical region gene 8 


105460 


AW298078 


HsJ271721 


Homo sapiens, clone )MAGE:4 179986. mRNA. 


105667 


AA767526 


Hs.22030 


pared box gene 5 (B^ lineage specif 


105743 


BE246502 


HS.95S8 


sema domain, immunogh^xiiin domain (I9), 


1(^82 


H09748 


H&S7987 


&cdl CLL/^fnipttofna 11B(2sicfinQsrpro 


105846 


AW954084 


Hs^4951 


ESTs 


105691 


IK5984 


Hs^89068 


heat shock %VD pro&eon 1. alpha 


106019 


AF221993 


Hs.46743 


McKusick-4<aufman syndrome 


106069 


BE566623 


H&29899 


ESTs, WeaMy similar to G020n transcnp 


106073 


AL157441 


Hs.17834 


downstfcann nfi^Qhbof of SON 


106126 


AA576953 


Hs.22g72 


hypofhe&al protein FU13352 


106159 


AK001301 


Hs.3487 


hypotheiica) protein FU10439 


106220 


E}61329 


Hs 32196 


mitochondrial fit)osoma] protein L36 


106260 


AI097144 


Hs.5250 


ESTs, Weakly similar to ALUI.HUMAN ALU S 


106300 


Y10043 


Hs.19114 


hjQb-mobSity group [nonhistone ctiromoso 


106307 


AA436174 


Hs,37751 


ESTs, We^kif similar to putative p150 ( 


106318 


AA025filO 


Hs.9805 


cleavage and p(dyadenytatlon specific 


106341 


AF191020 


Hs.5243 


liypolhetical proteoi, estradtot*lnduced 


106440 


AA449563 


Hs. 151 393 


gkitamate-cysteine l^ase, catalytic sub 


106461 


D61594 


Hs.17279 


tyrosylprotein sulfotransfierase 1 


106586 


AA243837 


Hs.57787 


ESTs 


105605 


AW772298 


Hs.21103 


Homo safOBfts mRNA; cONA DKFZ)>564B076 (ft 


106654 


AW07548S 


Hs.286049 


pho^hoseiine arranotiansfefase 


106785 


Y15227 


Hs.20149 


deleted In lymj^ocySc leukemia, 1 


106813 


C05766 


Hs.181022 


OGH? protein 


106895 


AK001826 


Hs.25245 


hypotheGc^ prot^ FU1 1269 


106913 


Ai219346 


Hs.86178 
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para^iyroU bonnonB^kB honnano 






25.20 


1348Q3 


AO001528 


Hs.89718 


sj^nrsne syn{h3se 








134853 


BE268326 


Hs302c0 


S^anijrwiniMyoie^-cacfaoxanikteribonuda 








134859 


026488 


Hs.90315 


KIAA0007 protein 






a20 


134891 


RS1083 


Hs.30787 


ESTs 






7.40 


134980 


BE246400 


Hs.285175 


ace^ft-Cosnzyme A transportBT 


4.00 






134393 


BE409809 


Hs.301005 


purine-ndi eleaienl bindi^ protsin B 








135047 


AL134197 


Hs.93597 


cydovdependeni Idosse 5. regulatory su 


&50 






135080 


AI761180 


Hs.94211 


(oS\ (retjured for ceB dSisruifiaSofl, 


SlOO 






135103 


NiL00342fl 


Hs.d450 


zinc &iger proton 64 (KPF^ 




11.00 




135145 


AW014729 


HS.952S2 


nudear factor telaied to kappa 8 biRdiD 








135184 


U13222 


Hs.96028 


fo(t(h8adt)ax01 






7X0 


'135242 


A1583187 


Hs.97(M) 


cydinEI 


13.50 






135288 


AWQ234S2 


Hs.97849 


ESTs 


6.46 






135289 


AW372S69 


Hs.9788 


tiypotheScat protein MGC1(^24 siraQar to 




a8o 




135355 


AKQ01652 


Hs.99423 


ATP-dependenl RMA heGcase 


10.00 






135371 


NM-006025 


HS.S97 


pnsteasQ, serins; 22 


aoo 






13S393 


L1 1244 


HS.S9886 


comptement component 4-blnding protein. 









\2M 






4.58 
4.79 



4.48 



4.01 



14iD 



TABLE 5B shows the accession numbers for those primekeys lacking unigeneID's for Table 5A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
'Accession' column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey 

117079 
124305 
101502 
109792 
1^034 
102763 
126345 
127066 
127099 
119243 
125875 
112054 
126979 



CAT number Accessions 



122318 
114699 
114793 
108305 
108393 
100867 
123731 
109700 
120715 
113702 
115113 
101045 
108554 
108573 
119052 
126S22 
126605 
10376S 



1621717J 
242183 1 
18202.-€ 
754958.1 
1598157.1 
44641_1 
1653833_1 
1703458.1 
244301_1 
177479SJ 
1565433.1 
1538292^1 
171411.1 
880655.1 
292419.1 
135322.1 
150742.1 
111550.1 
113411.1 
ljgr.KT4586 
genbanKuAA609839 
genbanK.F09609 R)9609 
ganbanlLAA292700 
9enhanK.T97307 T97307 
genbanl^AA256460 
entre2.J05614 J05614 
genbankJ^A084948 
genbanlLAA066005 
149538 1 
416020 1 



H92325T97125 

AW983221 AA344870 AA344871 HS333f 
M2&956 
R4g625F10674 
K60340 N91637 
U82321 H66077 
N49713 N49819 W03810 
R25066 R20144 R20145Z43845 
AA347668 AW9S6810 Z44271 Fi07065 F07064 R13506 
T12603T12604 
H14480N98295 
R43590F10439 
AA210954AA211C07 
AJ809521 H12174242556 
AA429743AA442754 
AA127386R15644AA127404 
AA158245AA158235 
AA071391 AA069892AA069891 
AA075211 AA075245AA075126AA074946 
U14622 



AA292700 
AA2S6460 



AA084948 
AA086005 
R10889R10888 
W31912AI167491 
439280J AA676910 AA778853 AA778865 W868Q0 

46922.1 VW2667AI580740 AI690440 AI561350 AW467g06 AW1 51450 A1825927AUM1 716 AI885600 Ar742213 AW24e624 AI955498 AA033947 

AA845593 AI623711 N68S83 000054 AA193567 AW083S8S AW163216 AA1915S5 AA522778 AI628008 AI915518 AA843508 AI926195 
AA176265 AW167963 AA9921 15 W93647 AW103572 A1862994 AI342059 AA91 1 719 AA1 761 55 AA024712 AA059988 AA205591 A1591 107 
A1199673A]811756AJ275832A1422233A1191852AJ09fi682AIS80124AI683612AAS824S3AA927559AA488415T32414^^^ 
H44848 H20477 T91695 W47039 AA07005S AA024795 AA32a855 AA379248 AA379330 AA385580 W25920 W03688 AA448359 AA093881 
AW352477 AA089997 A1350265 W93479 N99689 AA932257 AW351469 H68590 AA663402 AA069771 AW087986 AI858420 AA600214 
AI970774AI857712A1583081 AI885584 AW1311S0 AI567981 AW002714 AW189973 AW075495AW168303AA953714AW516881 A1357375 
Ai566663 AW512676 AI570580 A1023690 AA448216 AI079853 A1422707 AA779516 AWQ26g?2 AW130082 AW162307 AW438646 AA709332 
AVV192394 AI167350 AI217879 AI1291 52 AA719509 AI350480 AA663418 At003634 AW118546 AA180281 AA442B33 Al^^ 
AI038759 AA846723 Ai248770 AA993694 A12B0335 AI885107 AW516649 AA641563 AAS9583S AA582S21 AI276744 AA436478 AI017350 
Ai620763 A1859887 N73926Ai076327 AJ741615A1160617 AW172819 AI492005 AA677429 AA996334 A1593771 AI950039 A1245629 A1288515 
AI856185 T93293 AA1732S2 AA599779 A1680092 AW439316 AI084555 AI272672 AI583507 AW473219 AA738132 AW473283 AJ367492 
AA995410 A!669S24 AA206353 A103309S AI040382 AA873S30 AI221074 AI934840 Ai418S80 AA844306 R94503 AA773520 AA843169 
AA219425AA62g8S8Al811719AVU411275A1590981VV37907AlS91178A]684051AAS83238AA669347AA^ 
AI884391AI241580A1003539AW176687AAOOg850 N34566A1333493AI188070AA070827AM11683AI280^^ 
AA021576 N71953 A]885888 AW076039 T15777 AI537673 AW248048 H09554 W93480 W47001 AW079114 AA063160 AA7574S3 R60788 
At859431 H20478 AA218882 AA757465 AA100995 A1854135 AI934209 AA070S03 H47008 AA219646 W61039 W93907 AW385050 W37967 
W78028 AA189007 AA479136 R93650 AA442312 T30287 AA847628 AA180262 AA00g649 C03892 AW149464 AA31 0963 AA219693 
AA069747 R29207 AA094784 AA29361S AA447848 AI984167 N90393 C05097 N56499 AW292351 AW149681 AW473258 AA629322 A)004409 
AVW05577AI954937AI811070AA902422AV«14437AA5354€0AA916877AW517122AA974657AA975649AV«^ 
W07688AA193645AA378994AA48g273F32267W39303AA021181 N88810AA406524 AA062S53AA436801 H()8985H15g79N40310 
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AA209340 N56174 N83374 AAt91088AW247^ AA24a)13 AADS31U AA9725^AW293994 M375893T12t39 W28186 AW243849 

AI2S8S29 AA843996 m5260A11882aS AW248079 R15838 
1195^ gecbanJLW45552 W45552 
tl2382 gsiib3n)L859S04 R59504 
10S264 genhanIU^A227934 AA227934 
100071 easajSS\(l2 A23102 
123315 714071J AM96339AA498646 



Table 6A shows 99 genes upregulated in … These genes were selected from …
the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: average of AI for samples from non-smokers with adenocarcinoma
R2: average of AI for samples from non-smokers with squamous cell carcinoma



Pkey ExAccn UnigeneID

100971 BE379727 Hs.&83213
101174 L17330 Hs.280
101296 Y12490 Hs.85092
101304 AA001021 Hs.6685
101806 AA586894 Hs.112408
101972 S82472
102274 U30930 Hs.158540
102394 NiyL003816 Hs.2442
102832 U92015
103010 XS2S09 Hs.161640
103439 X98266
103563 U02911 Hs.150402
103857 AI076795 Hs.45033
104233 ABG02367 Hsu21355
104^0 AW373062 Hs.83623
104907 AA055829 Hs.196701
106131 BE514768 Hs.^6244
106672 H47233 Hs.30643
106872 T56887 Hs.18282
IUvmOU AA156238 Hs.32S01
10S971 Z43846 Hs.194478
107982 AAfl3S375 Hs.S7887
108562 AA100796
10RS99 AB0t8549 Hs.6932a
108663 BE219231 Hs.292653
109247 AA314907 Hs.859S0
109630 R44607 Hs.22672
110193 AI004874 Hs.310764
110234 H24458 Hs.3203S
110644 R94207 Hs.288989
110886 AW274992 Hs.72249
111057 T79639 Hs.14629
111950 AF071594 Hs.110457
112291 R53972 Hs.25026
112956 Z43784 Hs.75893
113009 T23699 Hs.7246
113060 BE564162 Hs.250820
113073 N39342 Hs.103042
113074 AK001335 Hs.31137
113121 T48011 Hs.8764
113125 AA968672 Hs.8g29
113757 AA703095 Hs.16631
113848 W52854 Hs.27099
113884 AI333076 Hs.28529
113936 W170S6 Hs.83623
114875 AA235609 Hs.236443
114987 AA251016 Hs.87808
115460 AW958439 Hs.38613
115722 W916g2 Hs.5960g
116261 AA481788 Hs.190150
116830 H61037 Hs.70404
116970 AB023179 Hs.9059
117178 H98675 HSJ269034
117757 AF088019 Hs.46732
118283 AA287747 Hs.173012
118384 AF217525 Hs.49002
118657 AI822106 Hs.49902
120328 AA923278 Hs.290905
120404 AB023230 Hs.96427
120524 AA2618S2 Hs.192905
120888 AW20755S Hs.97093



Unigene Trtte ^ 

fatty acid binding protein 4, adipocyte

pre-T/NK cell associated protein tS.00

thyroid hormone receptor interactor 11

thyroid hormone receptor interactor 8

S100 calcium-binding protein A7 (psorias

gb:b8ta -poNDNA poiymerase beta (axon a 

UDP glycosyltransferase 8 (UDP-galactose 7.50

a disintegrin and metalloproteinase doma 7.50

gb:Human clone 143789 defective mariner 13.50

tyrosine aminotransferase 9.50

gb:H.sapiens mRNA for ligase like p(Ol»

activin A receptor, type 1 9.00

lacrimal proline rich protein

doublecortin and CaM kinase-like 1 13.50
nuclear receptor subfamily 1, group I, m
ESTs.WeaWyslmilartoALUlJiUMANALU 16.50 
SNARE protein 
ESTs 

KIAA1134 protein HSO 
ESTs 

Homo sapiens mRNA; cDNA DKFZp43401572 (f 9.50 

ESTs. Weakly similar to K1AA0758 protel 

gb2m26c06.s1 Stratagene pancreas (93720 16^ 

MD-2 protein 13.00 

ESTs, WeaMy similar to T26845 hypot)»ti 

ESTs 7,00 

ESTs 

HomosapiensmRNA:cDNAOKFZp434M082{fr 1230 
EST 16.50 
ESTs. HigWy similar to type H CALM/AF1 8.00 
thrae-PDZ containing protein similar to 17.00 
ESTs 16-50 
Wblf-Hirsciiliom syndrome candidate 1 1 1 XO 

ESTs 

ankyrin 3, node of Ranvier (ankyrin G)
ESTs

hypothetical protein aJ14a27 9.79
microtubule-associated protein 1B 32.50
protein tyrosine phosphatase, receptor t
EST 

hypothetical protein FU 1 1362 19-50 
ESTs 

hypothetical protein HJ23293 similar to 6.00 
chromosome 12 open reading frame 2 
nudear receptor subfamily 1, group I, m 

Homo sapiens mRNA; cONA OKFZp564N1063 { 

EST 
ESTs 
ESTs 

ESTs 9.50 
ESTs. Wb^similarloALUajiUMAN ALU &50 
KIAA0962pratfiin 7.50 
ESTs 

EST 7.50 
ESTs. WeaMy similarloA46010X-IInked 16.50 
Down syndrome cell adhesion molecule 
ESTs 

ESTs. WeaWy similar to protease (Ksapi 
K1AA1013 protein 7.00 
ESTs 6.00 
Homo sapiens cONA: RJ23004 8s. done L 17.92 



R2 

a64 

2.46 
12.00 
268 
211 



250 

3.94 
12.66 
217 

238 
295 

240 
5.00 



3.00 
279 
4.S0 



3.82 
221 

2.65 

6.00 
4.63 
7.00 
600 
227 
9.00 



268 



250 
239 
3.S0 
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121SS8 


AM12497 






121676 


K56037 


Hs.108145 


CO IS 


121936 


A1024600 


Hs.98612 


ESTs 


121938 


AA428659 


HS.9B610 


PQTc 
CO IS 


122177 


AA435789 


Hs,98833 


Col 


123442 


AA2K652 


Hs.111496 


UnmAcsrvoncrnMAPIillA&lfe A\t»UP 


123551 


AA608837 




nK>sB1!^h17c1 5«rurM fp^fis NHTHonvman 


123756 


AA609971 


Hs.l 12795 


COT 
col 


123851 


AA&20840 






124371 


N24924 


Hs.188601 


CCTc 
Co IS 


127477 


BE328720 


Ks.280551 


CO 15 


127591 


A)190540 


Hs.131092 


ESTs 


128252 


AM5S924 


Hs.192228 


CCTe 

cots 


128426 


AI265784 


Hs.145197 


CCTc 
CO IS 


128325 


R67419 


H&21851 


Um»ia »«Manc f^nUA CI l19Qf)Afe Avntt NT 
nOnK> SSptcnS CUI\A rU 1 caW iBi wm n 1 


128345 


A1990S06 


Hs.8077 


Unmn tstripfifi mRMA.' d)NA DKFZlfi47C184 ffr 


129105 


AI769160 


Hs.108681 


Homo S2pi6ns brsn tufnor flsf>fl^i^^tod pfot 


129235 


AW977238 


Hs.126084 




129506 


AB020684 


Hs.11217 


VXAAOSm nmfffin 
IWvwOf / praiBUi 


129595 


U09550 


Hs.1154 




130160 


AA305688 


Hs.267695 




130340 


D82326 


Hs.23gi06 


SOUHB CantcT lanaiy o (cysunci oumqi 


131220 


ABQ23194 


Hs^0855 


VIA An077 fvnlfaift 


131430 


A1879148 


Hs:»770 




132114 


NMJ006152 


Hs.40202 




132458 


AA935315 


Hs.48^ 




132647 


NM„006327 


Hs.54432 


OidjyuicMKMtaoaa ho ^ucuryaiaMuotuaac 


132655 


D49372 


Hs.54460 


tfwtii/Hhtci />vfnIrinA eiihfamtk/ A /f^ 
Sfiidii uiuugtilB byuAUia auuloiiujy n \\^ 


132682 


AI077500 


Hs.54900 


ScIuXjQiCSUy UolulcU CWmI mHIMH olHig 


132747 


AA345241 


Hs.55950 


CO IS, Vfooiuy Miiiuai vj iMfVMwtmyiwBwi 


132812 


R50333 


Hs.92186 




133337 AF0859B3 Hs.293676 ESTs
133875 AL134906 Hs.771 phosphorylase, glycogen; liver (Hers dis
134119 AW157837 Hs.79226 fasciculation and elongation protein zet
134464 AA302983 Hs.239720 CCR4-NOT transcription complex, subunit
134542 M141S6 Hs.d5112 insulin-like growth factor 1 (somatomedi
135002 AA448S42 Hs.251677 GantigenTe
135305 AA203555 Hs.9&288 Homo sapiens cDNA FLJ14903 fis, clone PL



PCT/US02/12476 



295 



taoo 
i&oo 

1400 

8^ 

13.04 

11.50 

11.00 

6l50 



7.00 



laoo 

15.50 

6.50 

20.00 
11.50 
17.50 
&10 



7.50 



87.00 



2.50 
4.33 

2.08 
2.11 



4.25 

laoo 



6.15 
5.58 

253 
250 
283 
3.82 
SlOO 
3.00 
206 
227 
11.50 

&50 



TABLE 6B shows the accession numbers for those primekeys lacking unigeneID's for Table 6A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
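
As a reading aid, the Pkey / CAT number / Accession layout described above can be sketched as a small parser. This is only an illustration and not part of the disclosed method: the names `ClusterRow` and `parse_cluster_row` are hypothetical, and the sketch assumes each row is whitespace-delimited with the Pkey first, the CAT number second, and one or more Genbank accession numbers after it. The sample row is given in that assumed shape.

```python
from typing import List, NamedTuple


class ClusterRow(NamedTuple):
    """One row of a Pkey / CAT number / Accession table (hypothetical type)."""
    pkey: str              # unique Eos probeset identifier number
    cat_number: str        # gene cluster number
    accessions: List[str]  # Genbank accession numbers listed for the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into its three columns.

    Assumption: the first token is the Pkey, the second is the CAT number,
    and every remaining token is an accession number.
    """
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {line!r}")
    return ClusterRow(tokens[0], tokens[1], tokens[2:])


# Illustrative row in the assumed layout.
row = parse_cluster_row("102832 entrez_U92015 U92015")
print(row.pkey, row.cat_number, row.accessions)
```

Keeping the accession list as a separate field makes it straightforward to look up every sequence that contributed to a cluster without re-splitting the row.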



Pkey CAT number Accessions

108562 36375_1 AA100796 AF020589 AA074629 AA075946 AA100849 AA085347 AA126309 AA079311 AA07g323 AAD85274
103439 3S330J X98266 N41124
123551 genbank_AA608837 AA608837
123861 genbank_AA620840 AA620840
102832 entrez_U92015 U92015
101972 entrez_S82472 S82472
121558 genbank_AA412497 AA412497
Table 7A shows 98 genes downregulated in … These genes were selected from 59830 probesets on the
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting
the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: 90th percentile of AI for samples from smokers with adenocarcinoma
R2: 90th percentile of AI for samples from smokers with squamous cell carcinoma
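
The R1 and R2 columns are described as 90th percentiles of AI across a group of samples. A minimal sketch of how such a value could be computed is given below. The nearest-rank percentile definition, the function name `percentile_nearest_rank`, and the sample AI values are all assumptions made for illustration; the text does not state which percentile convention was used.

```python
import math
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile using the nearest-rank definition.

    Nearest rank: sort the values and take the element at the 1-based
    position ceil(pct * n / 100). This convention is an assumption.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical AI values for one probeset across ten samples.
ai_values = [12.0, 55.5, 8.1, 154.1, 77.4, 30.2, 66.0, 21.7, 99.2, 40.0]
r_value = percentile_nearest_rank(ai_values, 90)
print(r_value)
```

A high percentile rather than a mean keeps the summary sensitive to the subset of samples that express the gene strongly, which is one plausible reason to report it.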



riwy 


ExAocn 


UrasenelD 


UrigeneTffle 


R1 


R2 




017793 




a^o-keto redudase femily 1, member C3 




154.10 




082343 


HS.18S51 


fleuiobtasionis (neive fissue) protein 




77.40 


100576 






^^^^Hpinji^iyi^l^^lQpyij^flilfl^ pdypepbd 


102.40 




iwar i 


PC97Q797 






463.80 




10t04S 


KOI 160 




bmcEng pPotBUi . ipocyta 


67200 




1UIU!>0 




Hs.889 


flhamlj AvriM crv^tel Diatein 


66.00 






U62671 


He 3gQSf) 


mf^anomA antloen famih/ A. 2 




77^ 


101497 


W05150 


Hs.37034 


homeo box AS 


62i0 




1UIDOJ 




Hs.2178 


ri£0 IMOlvilw ICIIIuji lllw4IIUd vC 


78.00 




101677 


NM 000715 


Hs.1012 


complement component 4-Und8tg piotdn, 


186.20 




101745 


M88^0 


Hs.150403 


dop3 decarboxylase (aroroafe L^arTuno aci 


8a08 




101941 


S77583 




gb^4ERVK1Q/HUMM7V revase transcriptase 


99.20 


mio 


102125 




Hs^8215 


si^yftfansferase 




102242 


U27185 


Hs.82547 


f etlnoic arid r^epto respcnder ^azaro 


67.00 






U37055 




macfoohaae stimutafina 1 /heoatocvte oio 


71X0 






U39840 


Hs^9867 


hpnaioct/fp rarifisr factor 3. atnha 




69.70 






Hs,2359 


Hi id enpTifiriiv ntuvtnhflt^KP d 


15100 


65.70 




U71207 


L|« 0Q97q 


pvAe flKeAnl fnmenf^lA) hnnvdm 9 






ALU I a&Hu 


Ue i07niQ 


evmrd^n* Himilnailn intRfadina nntei 

Ojlll|AOIMIf| IHMUUiyUM KimOTnHiy ymWM 




58.80 




iMw^UUOitiO 




neurotensin 




268,80 


103207 






QD.nunisi pnuugengua icuuvuus nuwn lui 


70.00 


212.10 


103242 




ris.007 


•i1i«nl«nl Waliu/lmminaeA 7 fi<laee RA mi A 

awonoi uenyurogenasc r ^Moaa ivj, niu o 








Hs.3155 


i*deMfi ofnho 




130.70 










64.60 








no* 1 1 


KtAAQ300 nrotetn 


ssiso 




104252 








$^ 










SOiU(o (vdlTlDi IdlTUiy H| 9U0IUII1 iNCoiUUtl 


94.40 




105024 






ESTs 


63*20 




106260 


A 1/197144 




F*;T'^ WflaWv simlbtr to ALU i WWiJi ALU S 

to 1 Dv vvcoiuj initial UJ />uu i^nu4¥4mY rxkmSJ w 




74.60 








nlirt9mat<Vi/*vetptna Itfuep rsitnlufir eiih 




71.10 


IU09Q0 






nh'(ifi111R016F1 MlH MGC. 17 H^nasaofinse 


73.20 






AW7759Qfl 


He 91103 




83.80 




106614 


AAfi4RL<3 


HS-33SQ51 


hvnnthptir^ nmtnn AP3fl19?9 




62.30 


106654 


AW07S4AS 




nhnenhnepfinn pminatransfistasA 




202.40 


105999 


H932S1 


Hs.10710 


hypothetical protein FLJ20417 




89.60 


10R700 
TABLE 7B shows the accession numbers for those primekeys lacking unigeneID's for Table 7A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT number Accessions 

103207 30635 -4 X72790 

106566 120358 1 BE298210A!672315AW086489BE2g8417AA455921 AA902S37BE327124R14963AA085210AW274273AI333584Ai3^^ 

AI885095AI476470 AI287650 At8852g9 Aig8S381 AW592624AW340138AI266556 AA4S6390 AI310815AA484951 

116571 genbank_D45652 D45652

118466 genbank_N66741 N66741

101048 entrez_K01160 K01160

101941 entrez_S77583 S77583

103351 entrez_X89211 X89211

123130 genbank_AA487200 AA487200
endogenous retroviral protease
guanine nucleotide binding protein (G pr
phosphoinositide-3-kinase, regulatory su
S100 calcium-binding protein A4 (calcium
zinc finger protein 9 (a cellular retrov
tryptophan 2,3-dioxygenase
hepatocyte nuclear factor 3, alpha
(NONE) 

gb:Human RP1 homotog niRNA. 2f\nR region 
Honno sapiens cOMA: i=U21930 lis. done H 
plasminogen activator, urokinase
ubiquitin carboxyl-terminal esterase L1
integrin, beta 4

reticulocalbin 2, EF-hand calcium bindin
S100 calcium-binding protein A2
junction plakoglobin

ESTs. Weakly ssnilar to AUJ7J1ilMAN ALU S 

ESTs 

ESTs 

Hoino sapiens cONA FLI1 1570 lis, done HE 

tntegrin,beta8 

ESTs 

Homo sapiens voltage-gated sodium channe 
receptor (calcitonin) activity modifying 
diop-o^hjcose 4.&dehydratase 
Homo sapiens cDNA FU13103 Gs. done NT 
ESTs 

hypothetical protein FU2066S 

ESTs, Modcratelysimaar toALU7.HUMAN A 

ESTs 

ESTs, Weakly droliarto transfiDRnatton-r 

ESTs 

ESTs 

endogenous retroviFal piotease 
hypothetical protein FU 1 4033 similar to 
ESTs 

a disintegrin and metalloproteinase dome 
ESTs 

hypoth6ScalproteaiKIAA1165 
ESTs 

Homo sapiens cONA FU1 1883 lis. done HE 

hypothetical protein 

ESTs 

gb:yg87b07£l Scares inCanibfain 1N1BH 

ESTs 

ESTs 

Human DNA sequence ftoffl PAC 75N13 on chr 

ESTs 
(NONE) 

glyooprotein (transmembrane) nmb 
dachshund (DrasophBa) homdog 
ESTs 

Homo sapiens cONA FLl 13495 fis. done PL 
K1AA1462 protein 

anterior gradient 2 {Xenepus laevis) horn 
hypoQieteal proton FU11088 
NAOPH oxidase 4 
ESTs 

ESTs. Moderately s*niIl3rtoALU7.HUMAN 
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4.98 
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076 
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1.15 
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0.94 


1,67 


1.17 


0.46 


1.07 


1.07 
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0.97 
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0.17 
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aoo 
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3.91 


1.49 


1.15 


1.03 


2.83 


4.79 


Z08 


1.54 
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1.35 


1.87 


1.55 


1.83 


1.30 


1.54 


1.15 


1.39 


1.19 


183 


1.13 


1.25 
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15.50 


29.07 


1.00 


1.00 
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1.35 


0.12 


1.40 
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1.65 
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071 


123 


1.66 


1,52 


0S2 
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037 


097 


0.78 


084 


023 


117 


037 
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1.60 


1.00 


093 


1.16 


1,02 


1.03 


0.24 


088 


008 


1.31 


1.29 


1.26 


048 


0.96 


029 


074 


099 


156 


1.24 


1.00 


075 


1.03 


too 


175 


O04 


10.68 


0.80 


096 


2.63 


4.29 


1.78 


2.71 


1.00 


101 


1.70 


180 


1.20 


119 


031 


1.30 


Z09 


241 


0.72 
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009 
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1.02 
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1.03 


1.23 


1.40 


1.00 


1.80 
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1.65 
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3314S0 N32912 Hs^13 

331531 N31343 

331547 N54811 

331578 N67960 Hs^499S9 

331589 N71027 Hs.152618 

331608 rfflSBSt Hs.112110 

331614 N92293 Hs^4fl272 

331658 mSJOl Hs.58030 

331671 W72033 HS.134S95 

331676 W79834 Hs.58559 

331581 W85712 Hs.119571 

331692 W93592 Hs.152213 

331717 AA190888 Hs.153881 

331718 AA191404 Hs.104072 
331811 AA40450O Hs.301570 
331820 AA405970 Hs.97996 
331831 AA412031 Hs.97901 
331852 AMiasaS Hs.98314 
331943 AA453418 H&21275 
331959 AA460702 Hs.82772 
331990 AA478102 Hs.139631 
332002 AA482009 Ks.105104 
332027 AA489671 Hs^l 
332029 AA46S897 Hs.145a53 
332033 AA48S840 Hs.251014 
332048 AA498019 Hs.201591 
332071 AA598594 Hs.205293 
332074 AAS99012 
332083 AA600200 
3320B5 AA600353 
332125 AA60986t 
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332180 H03348 
332185 H10356 
332203 H49388 
332232 N48891 
332240 N54803 
332261 N70294 
332275 R08838 
332280 R38100 






332304 R74041 

332314 T25862 

332384 Ml 1433 

332434 N75542 

332445 T63781 

332453 100205 

332458 M33493 

332504 AA053917 

332525 M17252 

332530 M31682 

332535 N2Q284 

332S39 AA412528 

332559 M13^ 

332563 N92924 

332565 AA234895 

332594 AA279313 

332634 S369S3 

332638 AA283034 

332640 AA417152 Hs.5101 

332654 AA001296 Hs^88217 

332665 AA223335 Hs.63788 

332692 AA496035 Hs-247926 

332716 100058 Hs79070 

332736 L13773 Hs.l 14765 

332758 X93921 Hs^938 

332781 AA233258 Hs.247112 

332792 

332816 

332858 

332906 

332911 

332912 

332922 



Hs.155546 
Hs.173933 
Hs.312447 
Hs.101433 
Hs.7327 
Hs.101689 
Hs.317769 
Hs.101915 
Hs.324267 
Hs.269137 
Hs.26530 
Hs,14638l 
K5^1201 
Hs.101539 
Hs.101774 
Hs.101850 
Hs.289068 
Hs,11112 
Hs.1 11758 
Hs.250700 
HS.1510S 
Hs^78430 
Hs.1735 
Hs.19280 
Hs.20183 
Hs.166189 
Hs^74407 
Hs^272 
Hs.3239 
Hs.283750 
HS50640 
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gb:yzl5g044l SDaresjnu!^_8Cferasi5_ 
gb»}74f04^t NGLCGAPjOv2Hoinosai£ens 
ESTs 
ESTs 

P7TXn7pn]teia 

EST 

EST 

ras homotog gens femSy. menter I 
ESTs, WeaWy sirrilar to rtioteJon [Mjiubc 
collagen, type Dl. aJpha 1 (Ehle»s4)aril 
whigtes$4ype MMW ntegraSoo si!8 fami 
Koino sapiens NY-REfM2 anSgen mRNA. par 
ESTs 
ESTs 

transcnpSon tBrnuoaSon facto, milac 
EST 

Hm sapiens inRNA: GO>tA OKF^SBSL0 1 20 (r 

hypoQnScalpfOleinFUIlOII 

odlagen. type XI, alpha 1 

ESTs 

ESTs 

hypoOieScal protein FLI20073 

ESTs 

EST 

ESTs 

K1M1211 protein 

gb:ae41e1U1 GesslerWibnsluinor Homos 

KJAA1080 protein: Gdgi-associaled, ganun 

nudeartactorl/A 

ESTs 

ESTs 

claudin 1 

ESTs 

EST 

Staiganlt disease 3 (autoGomal donunanQ 
ESTs. Weakly similar to putative p150 [ 
ESTs 

serum deprivation response (ptuispliatidyl 
RNA binding motif protein, X chromosome 
nedifl 3; DKFZP566B0846 protein 
ESTs 

tiypothetical protein FLJ23045 

retlnol^nding protein 1, celhiar 

Homo sapiens cDNA RJ1191d fis. done HE 

ESTs 

keratin 6A 

t;yptaset)^1 

duomosome 14 open reading frame 1 

cytochrome P450, sutrfamily XXIA (steroid 

inhib'n. tieta B (acfivin AB beta poiypep 

cysleine-rich motor neuron 1 

ESTs, Weakly similar 10 AF164793 1 prate 

GylQteralln2 

protease, serto. 16 (thymus) 

E1A binding protein p30Q 

mettiyl CpG tundlng protein 2 O^syndr 

tenascin XA 

JAK binding protein 

protein reguJator of cytokinesis 1 

hypothetical protein A/.GC2941 

propionyl Coenzyme A carboxylase, beta p 

gap junction proton, alpha 5, 40kO (con 

v-myc avian myelocytomatosis viral onoog 

myeioici/lymptioid or mixed-Toieage leukem 

dual specificity phosphatase 7 

hypothetical protein FLi10902 
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337043 
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11.53 
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16.14 
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11.41 
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337194 


1.88 
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337229 


0.22 


1.03 


337268 
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337299 


3.23 


5.14 


337325 
337389 
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10.42 


337493 
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337500 
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1.00 
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1.00 
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053 
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338322 
338357 

3383ffl 
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338414 
338418 
33846S 
338501 
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338523 
338549 
338561 
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338676 
338726 
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338804 
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4.10 
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a69 
a40 

a47 

&12 
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&2a 
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aio 
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ai7 
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ai2 
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1.00 
4.30 
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1.00 
1.00 
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SlII 
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Z70 

0.81 

1.46 

0.91 
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6.88 
TABLE 8B shows the accession numbers for those Pkeys in Table 8A lacking unigeneID's. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT number Accessions

322044 187363_1 AW340926 AA249063 N88075
322060 44320_1 AI341937 AW003O63 U34725 AA904742
321430 42705_1 X57414 X57415
321467 43034_1 X13075 X13076
322125 46779_1 R93901 AF075073 R93902
322166 46861_1 H69434 AF08S958 H89846
322173 46873_1 H52567K5^AR)85970 H52164
322178 46882_1 H56535 AF085980 H56712
322179 46885_1 H92891 AF085982 H92777
321577 1615102_1 H84649 K84252 H84260 H86664 HK320
321587 1615333_1 895531 H95521 H84S29
313723 111953_1 AA070412 AA102346 AAOS188S
320997 62749^.1 H22544 H46842 AI204929
322278 47271_1 W69304 AFOS6283 W89200
321687 218439_1 AA625149 AA313030 AA313052 H97463
313883 129439_1 AA665089 AA135130 AA48405g AA102419 AW877765
322320 47422_1 W791SO AF086419
322339 ei4584J AI666646 AI734214 W17348
314648 293660_1 AW97926a AA878419 AA431342 M431628
300201 682222_1 AI308300 AI308296
306897 25196..2 AI093967
323155 979809_1 AL120701 AL135041 AL121524
322527 38927_1 AF147359 T58511 T58550
322585 47376BJ W88919 W89125
300362 1574395_1 242308 H23S14
322635 82296_1 AA005129 AA67g084 AA694399
322664 85042_1 AA011522 AA702841 AA011691 AA330797
315454 380580_1 AI239464 AI239473 AA625812 AI208703
322687 3737^1 AF074666 AI110759 AR)90902
314852 327472_1 AI903735 AA491283 A{6949S3 AW97e903 AA761362
307783 697809_1 AI347274 AW844024
324072 269032_1 AA381722 AA381829 AW963906 AW963902 AA381242
300627 221345_1 AA488472 W27363 AA317053 BEG826B9 AW967036 BE079872
323505 195389_1 AW970512 AA280251 AI652287 BE466438 AI650725 AA551854 AA281574 AW571481
315791 403558_1 AA678177 AA677034
324303 233842_1 AL118754 AA333202 H38001
316519 442885_1 AA847835 AA768376
300926 333127_1 AA504860 AAS04911
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3245SO 
301682 
324S04 
324883 
302697 
302711 
302742 
318499 
310624 
302847 
304122 
303598 
311409 
312094 
319312 
319407 
319425 
320007 
320018 
319484 
318855 
312220 
319546 
312389 
319611 
312437 



311896 
319834 
321102 
321158 
321199 

305528 
321270 
314126 

320714 
306442 
306446 
306458 
306510 
306557 
306572 
306562 
308656 
306686 
3Q6751 
308011 
306892 
308106 
303154 
306956 
306958 
308213 
308216 
308219 
308588 
308599 
308643 
308673 
308697 
308778 
308808 
308875 



308966 
308979 
303011 
303077 
305016 
305034 
305072 
305148 
305190 
303978 
303990 



305235 
305312 
305413 
305447 
321244 



^1 
2JSSS7J 
398033J 
1515973J 
43219J 
45419 1 



384430.1 

34824.4 

458.105 

77271.^ 

270283.1 

8372S4J 

797889_1 

1540116.1 

1688823_1 

1689571.1 

229683.1 

1815987.1 

1691553.1 

1535937.1 

1671607J 

243305.1 

90aJ67_l 

1566863.1 

291472.1 



579192.1 

112523.1 

80531.1 

410938.1 

212379.1 

2883^-3 

1662057.1 

177666.1 

743644.1 

AA976899 

AA977348 

AA978186 

AA988546 

AA994530 



AA492588AA492496AMd2571 
T78054 T7S888 AA398165 

Al6925S2AI393343A»»S10Ai377711 F24263 AA661876 

031010 D3Q^ 031168 D31166 031465 

Ai001409AJ00141D 

103442051348 

L12061 

T3451 AAS85296AA585305 

U88896 U88^AA916056 T03285A1341594A1353534AI634Q31Ua8397 

X^l XS3942 X98943 XSa953 XgS949 

H28966 

AA382d14 AA402411 AM12355 
AI698839AI90S260AI9092S9 
Z78390 T974Z7 
Z45481 F12393T74437 
R05329 R015SR08276 
T82930R02424T85145 

AA3^14 T82938 AA327744 AW967388 AA639g67 710753 

T83263 T85731T85730 

T91772RO7257R07098 

H1Q818F07831Z43072 

N74613 198756 798589 

R09692 R09414AA346353 

>U863140 W60703 R43474 

HI 4957 R56S22R1 1908 

BE080180AVin27313AVV231970AA995028AA428584AVV872716AWa92508AVV^^ 

AAS28743AA552874AA564756AWD63245Al267534AVV070190AVV893483AA770330AAg06928AA9065B2AA7 

AW063311AA429538 

AW205447 AI248530 A1084433 AI400976 R165S3 
AA071267 765940 764515AA071334 
AAO18305H38325AA0O1221 
H79670 H47798AA7002B9 

N34524AA305071AW954803AAS02335Al433430AI2fl3597AW026670AW265323AVV850787A^ 

AW385512Al334S66W32d51 H82656 H53902 R88904AW835732 

AA769155 

N59537 N78278 R83560 
AA226431 AA226569 AA488748 
R91883Ai445591 



AA936248 

A1004024 

A1015615 

AI032589 

AI439473 

AI09246S 

AI476803 

A]500600 

Ai125111 

A1125152 

AI557041 

A1557135 

A1557246 

Ai7182g9 

A1719893 

AI745040 

A1760864 

Ai767143 

A1811109 



A1832332 
A1833240 



AI8707D4 

A1873111 

41689.1 

44060.1 

AA626876 

AA630128 

AA641012 

AA654070 

AA665955 

AW513315 

AVy^l546S 

AW516449 

AW516611 

AA670480 

AA700201 

AA724659 

AA737856 

29327J 



Af090405 AFO9O407 AF090406 
AF163305AF163307AF163303 



AFO8a654AF088656Ai=068655 
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305614 


AA782366 


305637 


AAS06124 


305639 


AAB06138 


305650 


AA807nS 


305690 


AA813477 


305728 


AAB282l09 


305759 


AA83S353 


3)5792 


AAB45256 


307041 


A)144243 


307091 


A1167439 


307181 


AI189251 


305901 


AAS72968 






307415 




307426 






AI97SA55 






307561 






AI'XlAOQA 




nlj 10^00 


307730 




307760 




307764 




307796 


AI350556 


309045 




309051 


A!91 1975 


307807 




307K)8 




307820 


Af35575t 


3)7852 


AI36S541 


309122 


I/O 






ouyif # 








309299 








0U9*t/0 






AW 191 1 la 






3(^769 


aW/979'lAft 


3M799 








302679 


0 1 i 090. 1 n03Uc£ /v\ 1 00009 


309923 


AViTMflfiflA 


309928 


AWTUIAIR 

M(V<^ It 10 




/tW09 1 00^ 






302705 


01iOO_l UU9UQU UU9UO 1 


302789 




JU'niUO 




304024 


T03036 




lUOlDU 


3)4028 


T03265 


304046 


T54803 


304061 


T61521 




T62S36 


302802 




304114 


P7flQAR 


304155 


noooso 


304203 


N56929 


304234 


wo luUO 


304348 


A&17QPRfl 
AAI /sOOO 


0U*Ht<3U 










/WKKlf ID 
















AAOOlMUl 




AAfl07l1A 


OUOUtM 


MH9U00ID 


306065 




306104 




306109 


AA911861 


306242 


AA932805 


306288 


AA938900 


306396 


AA97a223 


330568 


NOT.FOUND enUez U56244 


330599 


15323^-12 U90437 


331131 


genbank_R54797 R54797 


331203 


NOT.FOUNO entrez T82310 


331531 


genbank_N51343 N51343 


331547 


457396.1 AAB28597 N54811 


332074 


genbankJ\ASg9012 AAS99012 



TABLE 8C shows the genomic positioning for those Pkeys … For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are …

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column … entitled "The DNA
sequence of human chromosome 22." Dunham, I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted
Nt_position: Indicates nucleotide positions of predicted exons
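
To make the Nt_position column concrete, the sketch below parses one "start-end" entry and reports the span it covers. It is an illustration under stated assumptions, not part of the disclosure: the names `ExonSpan` and `parse_nt_position` are hypothetical, both coordinates are assumed to be inclusive, and entries that do not match the plain digits-hyphen-digits shape are rejected rather than guessed at.

```python
from typing import NamedTuple


class ExonSpan(NamedTuple):
    """Nucleotide span of one predicted exon (hypothetical type)."""
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumes both coordinates are inclusive.
        return self.end - self.start + 1


def parse_nt_position(text: str) -> ExonSpan:
    """Parse an entry such as '73381-73768' into an ExonSpan."""
    start_text, sep, end_text = text.strip().partition("-")
    if not sep or not start_text.isdigit() or not end_text.isdigit():
        raise ValueError(f"not a 'start-end' nucleotide range: {text!r}")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError(f"end precedes start: {text!r}")
    return ExonSpan(start, end)


span = parse_nt_position("73381-73768")
print(span.start, span.end, span.length)
```

Rejecting malformed entries instead of repairing them keeps damaged coordinates visible, so they can be checked against the original table rather than silently altered.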



Ptey 


Ref Strand 






332792 


Dunham, LelaL 


PUjs 


73381-73768 


332816 


Dunham, LelaL 


Plus 


359844^030 


332S06 


Dunham, letaL 


plus 


1923101-1923205 


332911 


Dunham, Lelal 


Plus 


1361767-1951858 


332912 


Dunham, 1. etaL 


Pbis 


19621 20-1S62246 


332922 


Dunham, LelaL 


Plus 


2009S20-2009738 


332956 


Dunham. L elal. 


Plus 


2510528-2510658 


332959 


Dunham. LelaL 


Phis 


2518145-2518213
333138 Dunham, I. et al. Plus 3369205-3369323
333139 Dunham, I. et al. Plus 336949&-3369571
333221 Dunham, I. et al. Plus 3978070-3978187
333330 Dunham, I. et al. Plus 4904775-4904846
333387 Dunham, I. et al. Plus 491O93&4910997
333512 Dunham, I. et al. Plus 55605104360564
333524 Dunham, I. et al. Plus 561262(^5612780
333585 Dunham, I. et al. Plus 6234778-6234894
333618 Dunham, I. et al. Plus 6562391-6562566
333627 Dunham, I. et al. Plus 6620584-6620903
333628 Dunham, I. et al. Plus 6629004-6629233
333650 Dunham, I. et al. Plus 679685^6797128
333678 Dunham, I. et al. Plus 7068223-7068288
333750 Dunham, I. et al. Plus 7608165-7608234
333763 Dunham, I. et al. Plus 7692491-7692530
333767 Dunham, I. et al. Plus 7694407-7694623
333768 Dunham, I. et al. Plus 7695440-7695697
333769 Dunham, I. et al. Plus 7696625-7696707
333772 Dunham, I. et al. Plus 7706773-7706902
333777 Dunham, I. et al. Plus 7746805-7746916
333846 Dunham, I. et al. Plus 8008623-8008757
333884 Dunham, I. et al. Plus 8153960-8154161
333887 Dunham, I. et al. Plus 8154882-8155025
333891 Dunham, I. et al. Plus 61S6437-8156709
333692 Dunham, I. et al. Plus 815682S81S7001
333948 Dunham, I. et al. Plus 8583497-8583627
333954 Dunham, I. et al. Plus 5563186-6563335
333966 Dunham, I. et al. Plus 8655643-8655826
333966 Dunham, I. et al. Plus 8681004W1241
334061 Dunham, I. et al. Plus 9686941-9687077
334094 Dunham, I. et al. Plus 9889953-9890105
334113 Dunham, I. et al. Plus 10282459-10282597
334161 Dunham, I. et al. Plus 10599033-10599180
334219 Dunham, I. et al. Plus 12716160-12716384
334239 Dunham, I. et al. Plus 13056569-13056693
334333 Dunham, I. et al. Plus 13603544-13603657
334378 Dunham, I. et al. Plus 13907239-13907370
334382 Dunham, I. et al. Plus 13915866-13916036
334562 Dunham, I. et al. Plus 14987647-14987940
334588 Dunham, I. et al. Plus 15032740-15032817
334616 Dunham, I. et al. Plus 15176123-15176470
334633 Dunham, I. et al. Plus 15333206-15333305
334866 Dunham, I. et al. Plus 16872214-18872317
334891 Dunham, I. et al. Plus 19299770-19299944
334934 Dunham, I. et al. Plus 20103970-20104058
335015 Dunham, I. et al. Plus 20682792-20682945
335120 Dunham, I. et al. Plus 2l436286-2i43o384
335125 Dunham, I. et al. Plus 21441390-21441471
335179 Dunham, I. et al. Plus 21634405-21834526
335188 Dunham, I. et al. Plus 21669118-21669328
335211 Dunham, I. et al. Plus 21774611-21774680
335351 Dunham, I. et al. Plus 22807292-22807445
335379 Dunham, I. et al. Plus 22899308-22899420
335414 Dunham, I. et al. Plus 23235546-23235684
335416 Dunham, I. et al. Plus 23237354-23237465
335496 Dunham, I. et al. Plus 24154386-24164545
335497 Dunham, I. et al. Plus 24167666-24167869
335558 Dunham, I. et al. Plus 24740167-24740347
335586 Dunham, I. et al. Plus 24990333-249S0497
335688 Dunham, I. et al. Plus 25439839-25439920
335764 Dunham, I. et al. Plus 25942710-25942792
335823 Dunham, I. et al. Plus 2536592$-26366004
335983 Dunham, I. et al. Plus 27938968-27939070
335995 Dunham, I. et al. Plus 28009044-28009184
336021 Dunham, I. et al. Plus 26686482-28686559
336034 Plus 29014404-29014560
3^038 Dunham, I. et al. Plus 29022963-29023165
338107 Dunham, I. et al. Plus 29387731-29337869
336632 Dunham, I. et al. Plus 9S3S^)-Sa5523
336S33 Dunham, I. et al. Plus 985591-^86221
33SS34 Dunham, I. et al. Plus 98S29&-988570
336635 Dunham, I. et al. Plus 987908-938384
33663S Dunham, I. et al. Plus 988418^85
336S37 Dunham, I. et al. Plus 989276-990813
33S638 Dunham, I. et al. Plus 99190&-993240
336659 Dunham, I. et al. Plus 1896402-1899478
336694 Dunham, I. et al. Plus 2420546-2420616
336721 Dunham, I. et al. Plus 337152^3371586
33S900 Dunham, I. et al. Plus 10235423-10236523
336948 Dunham, I. et al. Plus 12892290-12692381
337028 Dunham, I. et al. Plus 16544817-16644942
337034 Dunham, I. et al. Plus 17B21742-17821922
337162 Dunham, I. et al. Plus 2347834323479145
337183 Dunham, I. et al. Plus 23943605-23943586
337184 Dunham, I. et al. Plus 23373949-23974016
337268 Dunham, I. et al. Plus 28011979-28012034
3372S9 Dunham, I. et al. Plus ^02^29022775
337389 Dunham, I. et al. Plus 31401S(&^1401579
337493 Dunham, I. et al. Plus 3333076&33330981
337549 Dunham, I. et al. Plus 34474472-34474531
337755 Dunham, I. et al. Plus 3971764-3971900
337809 Dunham, I. et al. Plus 444S0694449193
337871 Dunham, I. et al. Plus 5443027-5443101
337958 Dunham, I. et al. Plus 6969162-6969270
338008 Dunham, I. et al. Plus 7697068-7697235
338033 Dunham, I. et al. Plus 8092128-8092271
338110 Dunham, I. et al. Plus 10384481-10384521
338112 Dunham, I. et al. Plus 10391398-10391600
338145 Dunham, I. et al. Plus 11386629-11386592
338146 Dunham, I. et al. Plus 11448985-11449085
338179 Dunham, I. et al. Plus 12808775-12808833
338197 Dunham, I. et al. Plus 13638107.13838181
338279 Dunham, I. et al. Plus 16168944-16169091
338316 Dunham, I. et al. Plus 17089711-17089983
338322 Dunham, I. et al. Plus 17132477-17132547
338357 Dunham, I. et al. Plus 18062184-18062402
338359 Dunham, I. et al. Plus 18074402-18074501
338366 Dunham, I. et al. Plus 18252026-18252189
338374 Dunham, I. et al. Plus 18371200-18371282
338414 Dunham, I. et al. Plus 19345573-19345660
338418 Dunham, I. et al. Plus 19435506-19435596
338501 Dunham, I. et al. Plus 21244713-21244828
33BS06 Dunham, I. et al. Plus 21221871-21221953
338523 Dunham, I. et al. Plus 21509763-21509864
338662 Dunham, I. et al. Plus 24404720-24404899
338804 Dunham, I. et al. Plus 27236005-27236108
338836 Dunham, I. et al. Plus 27792166-27792272
336879 Dunham, I. et al. Plus 28410653-28410734
338937 Dunham, I. et al. Plus 29160655-29160725
338993 Dunham, I. et al. Plus 30077787-30078184
339047 Dunham, I. et al. Plus 30760793-30760968
339100 Dunham, I. et al. Plus 31141580-31141765
339114 Dunham, I. et al. Plus 31456454-31456519
339121 Dunham, I. et al. Plus 31583467-31583536
339170 Dunham, I. et al. Plus 32216399-32216527
339293 Dunham, I. et al. Plus 33223671-33223819


332858 Dunham, I. et al. Minus 1339607-1339397
332982 Dunham, I. et al. Minus 2628296-2628109
332984 Dunham, I. et al. Minus 2532606-2532457
332998 Dunham, I. et al. Minus 2711704-2711565
333058 Dunham, I. et al. Minus 3028925-3028811
333097 Dunham, I. et al. Minus 3204124-3204035
333121 Dunham, I. et al. Minus 3308446-3308358
333122 Dunham, I. et al. Minus 3309595-3309531
333123 Dunham, I. et al. Minus 3310817-3310749
333140 Dunham, I. et al. Minus 3377220-3376309
333260 Dunham, I. et al. Minus 4308400-4308304
333603 Dunham, I. et al. Minus 64663354465727
333604 Dunham, I. et al. Minus 6457090-6466763
333904 Dunham, I. et al. Minus 8217374-8217261
333906 Dunham, I. et al. Minus 8218236-8218063
334183 Dunham, I. et al. Minus 11832582-11832506
334187 Dunham, I. et al. Minus 11921456-11921205
334222 Dunham, I. et al. Minus 12732417-12732289
334223 Dunham, I. et al. Minus 12734355-12734269
334255 Dunham, I. et al. Minus 13200776-13200592
334492 Dunham, I. et al. Minus 14478333-14478172
334648 Dunham, I. et al. Minus 15363301-15363222
334787 Dunham, I. et al. Minus 16299093-16298937
334933 Dunham, I. et al. Minus 20078117-20077991
334945 Dunham, I. et al.
334967 Dunham, I. et al.
334990 Dunham, I. et al.
335093 Dunham, I. et al.
33S283 Dunham, I. et al.
335289 Dunham, I. et al.
33S48 Dunham, I. et al.
335551 Dunham, I. et al.
335519 Dunham, I. et al.
335520 Dunham, I. et al.
335521 Dunham, I. et al.
335682 Dunham, I. et al.
335755 Dunham, I. et al.
33314 Dunham, I. et al.
335815 Dunham, I. et al.
33^35 Dunham, I. et al.
33^1 Dunham, I. et al.
335868 Dunham, I. et al.
335896 Dunham, I. et al.
335936 Dunham, I. et al.
335946 Dunham, I. et al.
336066 Dunham, I. et al.
336205 Dunham, I. et al.
33S275 Dunham, I. et al.
33S292 Dunham, I. et al.
336331 Dunham, I. et al.
336419 Dunham, I. et al.
336675 Dunham, I. et al.
335684 Dunham, I. et al.
336716 Dunham, I. et al.
336798 Dunham, I. et al.
337043 Dunham, I. et al.
337045 Dunham, I. et al.
337128 Dunham, I. et al.
337192 Dunham, I. et al.
337194 Dunham, I. et al.
337229 Dunham, I. et al.
337325 Dunham, I. et al.
337497 Dunham, I. et al.
337500 Dunham, I. et al.
337603 Dunham, I. et al.
337505 Dunham, I. et al.
337671 Dunham, I. et al.
337786 Dunham, I. et al.
337862 Dunham, I. et al.
338083 Dunham, I. et al.
338158 Dunham, I. et al.
338161 Dunham, I. et al.
338162 Dunham, I. et al.
338189 Dunham, I. et al.
338199 Dunham, I. et al.
338215 Dunham, I. et al.
338469 Dunham, I. et al.
338549 Dunham, I. et al.
338551 Dunham, I. et al.
338671 Dunham, I. et al.
338676 Dunham, I. et al.
338726 Dunham, I. et al.
338779 Dunham, I. et al.
338871 Dunham, I. et al.
338872 Dunham, I. et al.
338966 Dunham, I. et al.
339229 Dunham, I. et al.
339264 Dunham, I. et al.


325228 6381940 Plus
325235 6381943 Minus
329588 3962484 Plus
329560 3962491 Plus
329541 3983503 Minus
325328 5855875 Plus
325340 6017033 Minus
325373 5866920 Minus
325367 5866920 Minus
325389 5666921 Plus
32S436 5866939 Minus
325498 5866967 Plus
325471 oui f\iyi Minus
325557 '605B302 Plus
325559 6249595 Minus
325560 6249595 Minus
325569 6249599 Plus
325587 5682462 Plus
325585 6682462 Plus
325597 5866992 Plus
325639 5867002 Plus



Minus 


ai losooo-zu loooo/ 


iUGnus 


ZU1 /JO i 7S 1 0 




/UM 1 J99-4IIO41U0/ 




clZ9/00/-'£l 29/214 


Minus 


223)4 27S22303770 


MtCUB 


^2305350-22305708 


NSnus 




Minus 


/4q79S2o-^4q7o9oi 


Minus 


£MC£Of l-^0U024bo 


Minus 


£3092301-25092404 


Mmus 




Minus 


25421 215-25421 


Minus 


23/ 0J<wO-2d/ OO/ 4/ 


Minus 


2oJ2U04i>-20j1 so45 


Minus 


^M2Ua1 o-2oj2u42 1 


Minus 


2039331 1-26393240 


Minus 


2Dw4oo3-2o504742 


Minus 


20/ 1 i43/-2t>/ 1 |3UU 


Minus 


26977639-26977558 


Minus 


Z/360474-2/jau400 


Minus 


27555324-27^5788 


Minus 


29241um>-^40o42 


Kfinus 


30477456-3047731 1 




■JcvovQ 1 y-viLyoojjO 


A£nus 


3201 6035^3281 7927 


Minus 


33594927^3594971 


Minus 


3W52S0O'340S2445 


Minus 


2u2U73O-202Doo4 


Minus 


21 5SU00-21 57993 


^finus 


j2a9952-3239oo2 


Minus 


900o954-5oo875/ 


Minus 


17407330-17407251 


Minus 


17610892-17610821 


Minus 


2221 525 1 -2221 5034 


Minus 


24591853-24591771 


Minus 


24610510-24610359 


Minus 


26716579-26716481 


Minus 


30015948-30015800 


Minus 


33371317^33371258 


Minus 


33376212-33376158 


Minus 


1299296-1299194 


Minus 


4^ Accce 4 4jle4A*y 

1346555-1346397 


Minus 


32oUd34-32o054/ 


Minus 


4133203-4133081 


Minus 


5347030-5347550 


Minus 


9318438-9318301 


Minus 


i^TOAACtO^HTOA'iA^ 

n /94403-1 1 794343 


Minus 


1212471&-12124558 


Minus 


i2o249i 9-1 2824827 


Minus 


12o/03S4-1287o47o 


Minus 


1 37oOBo5-137w7oO 


Minus 


4A/\ce*A'^ 4JH\CC3CC 

14055447-14055355 


Minus 


20520387-20520242 


Minus 


22049171-22049081 


Minus 


2231 19o6-2231 1855 


Minus 


24a0o421-245ua34o 


Minus 


24637427-24637369 


Minus 


25926206-2S92S618 


Mmis 


27030151-27029795 


K£nus 


28301708-28301611 


Minus 


28300921-28300790 


Minus 


29614876-29614749 


Minus 


32722330-32722199 


Minus 


3297514S^750S3 


2630-2694 




162154-162264 


1189-1619 




2095-2990 




2765^9 




86780^854 




166656-166819 


1136686-1136777 


922881-922958 


239672-2397i 


a9 


29778-29907 




173372-173930 


50921-51050 




11859^119172 


133794-133981 


79927^7 




126724-126967 


73476-73574 




1065020-1065089 


253525-253608 
325739


5867038 






325740 


5867038 




207533'207690 


325792 


6469S28 


hSstsis 


lUHrlifO 


3K735 


©52447 


fiSttas 




325685 


6682468 


Phis 


11 1/400 


32S68S 


6682468 


Ptus 


llO«wf*110*W9 


325819 




b£nus 




329764 


6048195 


Minus 


inoT'n.inQQnfl 


329703 


6065793 


Minus 


iJSlW-IWl JO 


329643 


6448539 


P!us 




329816 


^24888 




7Q25&-70UJ 


329860 


6687260 


Minus 


1 63474-1 63oK) 




S67087 


Phis 


224So-22oo3 


325895 


58S7097 


Plus 


3583l7-3w47o 


325925 


5867124 


Pte 


115749-115952 


325S32 


5867127 


Plus 


73©-7441 


325941 


5857133 


Minus 


64228*4402 


325969 


5857153 


Pius 


101911-102081 


325971 


5867153 


Plus 


10584 V106035 


329993 


4567166 


Minus 


101307-101434 


330020 


G671887 


Phis 


i7Z39/-i7249i 


328163 


5867168 


Minus 


7831-8035 


326274 


5867171 


Minus 


410289410404 


326025 


5867176 


Hus 


70854-70915 


326046 
328099 


5867182 
58G7166 


Minus 
Minus 


ciccQ done 
O2ao0-Q2o29 

ool3oi'^1oiil 


326108 


S867167 


Mnus 


23784-23903 


326165 


5867208 




62787-62929 


325189 


5657212 


Plus 


09ZOO'«941J 


326204 


5867216 


Minus 


l40U0>l4o2UU 


326230 


5867230 


Minus 


301 008^01972 


330052 


4567182 


Pus 


352550-352953 


330036 


6042048 


Plus 




326350 


5867293 


Phis 


4 4^djl4 

1 3627-1 3844 


326589 


5867320 


Rus 


22760-22919 


326393 


5867341 


Plus 


41702-41841 


326505 


5867435 


Minus 


8816-8949 


326515 


5867439 


Plus 


36663-36809 


326592 


6138928 


Plus 


23689-23^ 


330107 


6015249 


Minus 


J AAA A 4 dAAAA'k 

1(n)u9i-1u0282 


330106 


6015249 


Minus 


99443-99778 


330100 


6015253 


Pius 


21166-21301 


330093 


6015278 


Plus 


1043-1199 


330088 


6015293 


Plus 


37517-37638 


330085 


6015302 


Minus 


59613-59770 


330120 


6671864 


Minus 


127553-127655 


330123 


6571869 


l^nus 


Af A4 4 ^PJAf? 

3531 1-^406 


326742 


5867511 


Minus 


95187-95248 


326605 


5867637 


Plus 


2455&'24749 


326618 


6117831 


t^nus 


4 C4 AA 4C4An 

15199-15309 


326720 


6552456 


Plus 


84525-84677 


326770 


6598307 


Minus 


C4')cn^ etocco 


326692 


6682502 


Plus 


117oy7-117BS9 


326593 


6682502 


Minus 


335002-335095 


326963 


5867657 


Minus 


4CMO 4CC04 

16023-16581 


326991 




Plus 


1o14/-io9o9 


326936 


6004446 


ftifinus 


4(VMT IMC? 

10217-10357 


326964 


6469836 


Pfus 


75340-75456 


327040 


6531965 


Plus 


7o3o7U-7B3B17 


327053 


6531965 


Pius 


OAjITAC*T AAil^jl*Vl 

2247257-2247437 


327075 


6531965 


Pius 


4041318-4041431 


327085 


6531965 


Plus 


4734947-4735069 


327036 


6531965 


Rus 


A4AAf4 MAnAJA 

319951-320040 


327130 


6531976 


Pius 


AAA i Y AAA J A 

20247-22343 


327156 


5866841 


Minus 


2452-2^ 


327288 


5667481 


Rus 


48583-48773 


327332 


5867516 


lifinus 


S6361-S5532 


327220 


5867525 


Minus 


65701<65781 


327224 


5867534 


Pius 


1884^186544 


327321 


6249562 


Mnus 


99745-99836 


327361 


6552412 


A&uis 


61013-62130 


327396 


5867743 


Plus 


8702-8820 


327414 


586773) 


Rus 


102461-102588 


327442 


58677S 


Rus 


111483-111618 


327467 


5867772 


Rus 


8803088151 


327473 


5867775 


Rus 


75101-75181 


OCtHOO 


9o0/f OJ 


Pius 


I019/0-i0100£ 


327377 


5867793 


Minus 


37610^676 


327562 


5667804 


Minus 


343989-344474 


32^ 


5867811 


Minus 


48152-46287 


327606 


6004463 


Rus 


200262-200495 


327511 


5867858 


^Gnus 


175053-175392 


327642 


5867891 


Minus 


2513-2743 


327654 


5857910 


Minus 


9704-97710 


327734 


5887940 


A/Snus 


31003^1583 
327775 5867984 Minus 13079V13C371
327798 5867982 Plus ^^•85405
327840 6249578 Minus 73065-73206
330203 8)13599 Plus 66517^66931
330263 6671884 Minus 101503-101534
323004 5867933 Minus 157407-157887
328101 5868020 Plus 289920-290014
32S100 58^20 Minus 283545-263635
328113 5868024 Minus 80378-80491
328157 5868064 Plus 73326-73515
328196 5868080 Minus 16551-16729
328197 5868081 Minus 42133^438
327940 5868197 Minus 9524095428
327984 5868216 Plus 66611-66677
328021 5902482 Plus 713478-714S90
328068 6117819 Plus 2S3903-254022
326264 6381912 Plus 55086^5404
330300 29058S2 Minus 3246^
326603 Minus 87770^953
328600 5868229 Minus 38889^10
328616 5668239 Plus 293920-294224
326623 5868246 Minus 120020-120126
32B632 58^47 Plus 76734-76853
328866 5868254 Minus
328698 5868254 Minus 525555^533
328700 5868254 Plus 754089-754203
328708 5868271 Minus 68114^8854
328735 5668289 Plus 89389^5
328743 5868269 Plus 274638-274726
328806 5868324 Plus 29408-29884
328299 5668366 Minus 149708-149889
328342 5868383 Plus 599S&a}094
328365 58^7 Minus 270724-270798
328369 5868388 Plus 75371-75583
328381 5868392 Plus 662758462848
3284S1 S88B425 Minus 217275-217336
328481 5866449 Minus 8987-9180
328500 5868464 Plus 5909W9481
328530 5868482 Plus 33497W35406
328664 6004473 Plus 1193739-1193866
328861 6381928 Minus 108317-108403
328908 5868493 Plus 117002-117059
328933 5868500 Plus 771755-771889
328934 5668500 Plus 84€342-84544d
328949 6456755 Minus 43552-43619
330313 6042030 Minus 33642-33775
329005 5858542 Plus 85470-85673
330366 2944106 Plus 151837-151914
330372 6580495 Minus 317461-317688
329033 5668561 Minus 5390-5479
329037 5866562 Minus 32466-32562
329067 5866591 Minus 145417-147652
329134 5868679 Plus 29959^18
329157 5868687 Minus 145S40-146155
329178 5868704 Plus 179177-179463
329192 5868716 Plus 166936-167020
329194 6868716 Minus 304450-304559
329204 5668720 Minus 3050-3190
329224 5866728 Plus 27422-27664
329228 5866728 Minus 50118-50287
329268 5868771 Plus 25554-26299
329337 5868806 Minus 467155467222
329011 6682532 Plus 4805&48741
TABLE 9A: Potential Therapeutic, Diagnostic and [...]

Table 9A shows about 1312 genes up-regulated in lung tumors ([...] tumors) relative to normal body tissues. These genes were selected from about [...] probesets on the Eos [...]

Table 9B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 9A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank [...] These sequences were clustered based on sequence similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 'Accession' column.

Table 9C shows the genomic positioning for those Pkeys lacking Unigene [...] For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.
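The strand and nucleotide-location entries described above can be read programmatically. The following is a minimal sketch, not a method taken from this disclosure. It assumes that both listed coordinates are inclusive, and it sorts the two coordinates because some Minus-strand rows list the larger coordinate first. The names `Exon` and `parse_exon` are hypothetical.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    strand: str  # strand label as listed, e.g. "Plus" or "Minus"
    low: int     # smaller of the two listed coordinates
    high: int    # larger of the two listed coordinates

    @property
    def length(self) -> int:
        # Assumption: both coordinates are inclusive.
        return self.high - self.low + 1


def parse_exon(strand: str, nt_position: str) -> Exon:
    """Parse a strand label and a 'first-second' range into an Exon."""
    first, second = (int(part) for part in nt_position.split("-"))
    # Some Minus-strand rows list the larger coordinate first, so sort them.
    return Exon(strand, min(first, second), max(first, second))


if __name__ == "__main__":
    # Two ranges as they appear in the exon listing.
    for strand, pos in [("Plus", "3369205-3369323"), ("Minus", "1339607-1339397")]:
        exon = parse_exon(strand, pos)
        print(exon.strand, exon.low, exon.high, exon.length)
```

Entries whose range was damaged in extraction (for example, a missing hyphen) will raise a `ValueError` here rather than being silently guessed.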






Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, [...]

average of normal lung samples

R2: Average of non-malignant lung disease samples (including [...]
Pkey ExAccn UnigeneID Unigene Title R1 R2
400195 NM_007057:Homo sapiens ZW10 interactor 1.00 1.00
400205 NM_006265*:Homo sapiens RAD21 (S. pombe) 15.80 396.00
400220 Eos Control 2.28 Z84
400277 Eos Control 7.68 a72
400285 Eos Control 1.00 1.00
400288 xas255 Hs.149609 integrin alpha 5 (fibronectin receptor, 1.04 2.24
400289 X07820 Hs.2258 matrix metalloproteinase 10 (stromelysin 132.45 4X0
400298 AA032279 Hs.61635 six transmembrane epithelial antigen of 4186 74.00
400301 X03635 Hs.1657 estrogen receptor 1 1.00 1.00
400303 AA242758 Hs.79136 LIV-1 protein, estrogen regulated 1.75 1.65
400328 X87344 Hs.180062 transporter 2, ATP-binding cassette, sub 0.87 1X0
400419 AF084545 Target 156.55 25100
400512 NM_030S78:Homo sapiens cytochrome P450, 1.00 ZOO
400517 AF242388 lengsin 167 87.00
400560 NM^OSOS?^ Homo sapiens cytochrome P450, 1.00 1.00
400664 NM_002425:Homo sapiens matrix metallopro 2)i6 45.00
400665 NM_002425*:Homo sapiens matrix metallopro 1.36 1.07
400656 NM_002425:Homo sapiens matrix metallopro 126 122
400749 NM_003105*:Homo sapiens sortilin-related 1.00 91.00
400763 Target Exon 7.63 24.00
401027 Target Exon 1.00 1.00
401093 C12000586*:gi|6330167|dbj|BAA86477.1| (A 1.00 155.00
401203 Target Exon 1.00 86.00
401212 C12000457:gi|7512178|pir||T30337 polypf 1.00 40aQ0
401411 ENSP00000247172*:HYPOTHETICAL 126.2 kDa 1.00 72.00
401435 C14000397*:gi|7499898|pir||T33295 hypoth 1.00 64.00
401464 AF039241 histone deacetylase 5 182 49X0
401714 ENSP00000241807:CDNA FLJ11007 FIS, CLON 102 40.00
401747 Homo sapiens keratin 17 (KRT17) 128.43 68.00
401760 Target Exon 1.74 35.00
401780 NM_005557*:Homo sapiens keratin 16 (foca 26.47 10.50
401781 Target Exon 4.01
401785 NM_002275*:Homo sapiens keratin 15 (KRT1 4.13 270
401797 Target Exon 1.4* 210
401961 NM_021626:Homo sapiens serine carboxypep 1.41 1.88
401985 AF053004 class I cytokine receptor 1.00 177X0
401994 Target Exon 61.84 47.00
402075 ENSP00000251056*:Plasma membrane calcium 1.00 1.00
402260 NM_001436*:Homo sapiens fibrillarin (FBL 1.58 1.39
402265 Target Exon 209 35.00
402297 Target Exon 1.00 9200
402408 NM_030920*:Homo sapiens hypothetical pro 28.87 13.00
402420 C1000823':9ii10432400{embIGAC1029aiI (A 1.00 1.44
402674 Target Exon 7.44 24100
402802 NM_001397:Homo sapiens endothelin conver 1.00 70.00
402994 NM_002463*:Homo sapiens myxovirus (influ 1.37 1.43
403137 NM_005381*:Homo sapiens nucleolin (NCL), 1.00 19.00
403306 NM_006825 transmembrane protein (63kD), endoplasmi 1.00 eoo
403329 Target Exon 1X0 61.00
403381 ENSP00000231844*:Ecotropic virus integra 1X0 119.00
403476 NM_022342:Homo sapiens kinesin protein 9 28.13 136.00
403485 C30018iy:gi|12737279|ref|XP_012163.1| k 20.23 76.00
403627 Target Exon &30 29.33
403715 Target Exon 1.30 35.00
404044 ENSP00000237855*:DJ398G12 (NOVEL PROTEI 1.00 54.00
404076 NM_016020:Homo sapiens CGI-75 protein ( 14.29 91X0
404101 C8000950:gi|423560|pir||A47318 RNAW 1.00 1.00
404140 NM_006510:Homo sapiens ret finger protei 1.42 1.44
404165 ENSP00000244S62:NRH dehydrogenase (quino 1.00 54.00
404185 Target Exon 1.00 117.00
404210 NM_005936:Homo sapiens myeloid/lymphoid 5i93 1177
404253 NM_021058*:Homo sapiens H2B histone fami 1X0 1.00
404287
4042S8 
404347 
404440 
404721 

404794 HSAJXiCm 

404854 

404877 

404327 

404986 

40S449 

4C5568 

405572 

405646 

405576 BE33S714 

405770 

405932 

4Q6137 

406360 



PCT/US02/12476



40S467 

406621 X57809 
406642 M245210 



406671 
406673 
406676 



406690 
406698 
406815 
406851 
406964 
406967 
405974 
407103 
407128 
407137 
407168 
407239 
407242 
407244 
407289 
407300 
407366 
407378 
407430 
407453 
407577 
407634 
407710 
407720 
407746 
407756 
407758 
407782 
407788 
407790 
407811 
407839 
407944 
408000 
408031 
40»}83 
408070 
408101 
408122 
408212 
408243 
406349 
408353 
408354 
408369 



408482 
408522 



408545 
408572 
408833 



408761 
408771 



AA129547 

M34996 

X58a99 

U7re34 

M18728 

M31126 

A429540 

X03068 

AA833930 

AA6097&4 

Iyl21305 

M24349 

M57293 

AM248B1 

R83312 

T97307 

R45175 

M076350 

M18728 

M10014 

M135159 

AA102616 

AF026942 

AA299264 

AF169351 

AJ132087 

AW131324 

AW016S69 

AW022727 

AB037776 

AK001962 

AA1 16021 

D50915 

AA608956 

BE514982 

AK)27274 

AW190902 

AA045144 

R34008 

L11690 

AA081395 

8E086548 

AW1488S2 

AW968504 

A1432652 

AA297567 

Y00787 

BE546947 

QE439638 

A1382803 

R38438 

AF1230S0 

NM.000676 

AI541214 

AW381532 

AW235405 

AA055611 

AW963372 

AA52S775 

AA057264 

AW732S73 



H$.18112S 

tfe.2S344t 
Hs^54 
Hs.198253 
HSX1221 



Ka^72822 
Hs^29 
Hs.73931 
Hs.288038 



Hs.256301 
Hs^7260 

HS.1171B3 
Hs.67846 

Hs.75431 

Ks.203349 

Hs.1207e9 

Hs.271530 

Hs.57776 



Hs^46759 
Hs.136414 
Hs.23616 
HS.38Q02 

Hs.38260 

HSJ8365 

Hs.1 12619 

Hs^l 

Hs.288941 

Hs,40098 

Hs.161566 

Hs.239727 

H&620 

Hl42t73 

Hs.42346 

Hi123073 

Hs.42a24 
449167 T05095 Hs.1SS97 KIAA1694 protein
449207 AL044222 Hs.23255 nucleoporin 155kD
449223 AJ403107 Hs.148590 protein related with psoriasis
449230 BE613348 Hs.211579 melanoma cell adhesion molecule
S AiS38293 girGI^J07ji1 UajXSAPjGC6lfoniosa]:sens
443313 AW236021 Hs.78531 Honios^Kas,Snaarto«xaid»lAS73D
449448 060730 Hs^471 ESTs
449467 AWZOSta* Hs.197042 ESTs
449523 NM_000579 Hs.54443 chemokine (C-C motif) receptor 5
449722 BE230074 Hs.23S60 cyclin B1
449976 H06350 Hs.135056 Human DNA sequence from clone RP5-850E9
450001 NM_001044 Hs.406 solute carrier family 6 [...]
450093 W27249 Hs^109 hypothetical protein FLJ21080
450101 AVW9989 Hs^4385 Human hbc647 mRNA sequence
450149 AW96g781 Hs.1328S3 Zic family member 2 (odd-paired Drosophi
450193 AI916071 Hs.15607 Homo sapiens Fanconi anemia complementat
450221 AA3281Q2 Hs^4641 cytoskeleton associated protein 2
450372 BE218107 Hs.202436 ESTs
450375 AA009647 Hs.88S0 a disintegrin and metalloprotein[...]
450447 AF212223 Hs.25010 hypothetical protein P15-2
450568 AU350078 Hs^5159 Homo sapiens cDNA FLJ10784 fis, clone NT
450589 AI701S05 Hs.2Q2S26 ESTs
450684 AA8726G5 Hs.25333 interleukin 1 receptor, type II
450701 H39960 HsJfi8467 Homo sapiens cDNA FLJ[...] fis, clone MA
450705 U903M Hs.25351 iroquois homeobox protein 2A (IRX-2A) (
450832 AA401369 Hs.190721 ESTs
450937 R49131 Hs.26267 ATP-dependant interferon response protei
450963 AA305384 Hs.2S740 ERO1 (S. cerevisiae)-like
451105 AI761324 sbM^Oblt-xINO-CG^-Coie Homo sapiens
451110 AI955040 Hs.265398 ESTs, Weakly similar to transformation
451253 H48299 Hs.26126 claudin 10
451291 R39288 Hs.6702 ESTs
451320 AW498974 diacylglycerol kinase, zeta (104kD)
451380 HD92B0 Hs.13234 ESTs
451386 AB029008 Hs.26334 spastic paraplegia 4 (autosomal dominant
451437 H24143 Hs.31945 hypothetical protein FLJ11071
451462 AK000367 Hs.26434 hypothetical protein FLJ20360
451524 AK001455 Hs.25515 hypothetical protein FLJ10604
451541 BE279383 Hs.26557 plakophilin 3
451592 AI805416 Hs^13897 ESTs
451635 AA018899 Hs.127179 cryptic gene
451743 AA401369 Hs.190721 ESTs
451808 NM_003729 Hs^7075 RNA 3'-terminal phosphate cyclase
451607 W52&54 hypothetical protein FLJ23293 similar to
451871 AI821005 Hs.118599 ESTs
451952 AL120173 Hs^1663 ESTs
452012 AA307703 Hs.279766 kinesin family member 4A
452045 AB018345 Hs^7657 KIAA0802 protein
452194 AI694413 Hs.332&49 olfactory receptor, family 2, subfamily
452206 AW340281 Hs^74 Homo sapiens, clone IMAGE:36Q6519, mRNA.
452240 AA401369 Hs.190721 ESTs
452256 AK000933 Hs.28661 Homo sapiens cDNA FLJ10071 fis, clone HE
452281 T93500 Hs.28792 Homo sapiens cDNA FLJ11041 fis, clone PL
452291 AF015592 Hs^8853 CDC7 (cell division cycle 7, S. cerevisi
452295 BE379936 Hs^66 programmed cell death 10
452304 AA025386 Hs.61311 ESTs, Weakly similar to S10590 cysteine
452340 NM_002202 Hs.505 ISL1 transcription factor, LIM/homeodoma
452349 AB028944 Hs.29189 ATPase, Class VI, type 11A
452367 U712)7 Hs.29279 eyes absent (Drosophila) homolog 2
452401 NM_007115 Hs^2 tumor necrosis factor, alpha-induced pro
452410 AL1336t9 Homo sapiens mRNA; cDNA DKFZp434E2321 (f
452461 N78223 Hs.108106 transcription factor
452571 VU31518 Hs^685 ESTs
452613 AA461S99 Hs.23459 ESTs
452699 A\nQ95390 Hs^13062 ESTs
452705 H49805 Hs^46005 ESTs
452747 AFt60477 Hs£1460 Ig superfamily receptor LMR
452787 AW294022 Hs.222707 KIAA1718 protein
452795 AW3925SS Hs.16878 hypothetical protein FLJ21620
452823 AB012124 Hs^0696 transcription factor-like 5 (basic helix
452833 BE559681 Hs^736 KIAA0124 protein
452838 U85011 Hs^743 preferentially expressed antigen in mela
452862 AA401369 Hs.190721 ESTs
452865 AW173720 Hs.345805 ESTs, Weakly similar to A475B2&ce8gr
452934 AA581322 Hs.4213 hypothetical protein MGC16207
452946 X9S425 Hs^1092 EphA5
452976 R44214 Hs.101189 ESTs
453028 AB006532 Hs.31442 RecQ protein-like 4
453095 AW29S650 Hs.252756 ESTs
453102 NM_007197 Hs.31564 frizzled (Drosophila) homolog 10
453103 AI301052 Hs.153444 ESTs
453120 AA292891 HsJ1773 pregnancy-induced growth inhibitor
4531S3 N53893 H^^4360 ESTs
453160 AI263307 Hs.23g884 H2B histone family, member L
453197 AI916269 HSwt09057 ESTs, Weakly similar to ALU[...]_HUMAN ALU S
Table 10B shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 10A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled using sequences [...] numbers for sequences comprising each cluster are listed in the 'Accession' column.

Table 10C shows the genomic positioning for [...]
0.77


2.47 


1.x 


74X0 


0-85 


1.96 


1.18 


2.56 


1.x 


76.00 


1.x 


63.x 


0.x 


2.19 


0.97 


1.^ 


1.x 


1X.X 


1.x 


0U.UU 


0.09 


2.S5 


1.x 


98.00 


1.x 




1.x 


52.x 


1.x 


13ZX 


OlII 


15.x 


1.x 


103.00 


0.x 


6.96 


1.x 


70.x 


1.x 


x.oo 


0.^ 


1.84 


1.x 


79.x 


0.91 


l.or 


0.x 


4 M 

ZM 


1.x 


7b.uU 


046 


1.46 


079 


2.25 


1.93 




0.04 


579 


1.x 


167.x 


0.04 


3.10 


1.x 


91.00 


120.16 


315.x 


ax 


1.84 


1.x 


128.x 


1.x 


108.x 


1.x 


91.00 


IjX 


87.x 


1.x 


105.x 


1.x 


71.00 


1.x 


115.x 


1.x 


X.00 


0.30 


310 


1.00 


77.x 


IjX 


85.00 


1.x 


OZ.UU 


Ikfo 


1.89 


1.x 


75.x 


a78 


5.83 


ax 


1068 


1.x 


70.x 


1.x 


197.x 


1.x 


25ax 


0.55 


2.x 


1.x 


9ax 
444515


AW20490O 


U- < COP TP 


445763 


A(741471 


t&23G6S 


445903 


R13S80 


Hs.13438 


446291 


6£397753 


Hs. 14623 


445917 


A1347883 


Hs.1 56672 


447251 




Ue 17017 


447432 






447482 






447^7 


K0Q^6 




448299 






448782 


At fic/ymsL 




45057S 


kill nncoco 

NA/|JQ0ao59 


rts./ai 1 < 


450584 


AA040403 


nS.oUo/1 


45(K93 






450715 


AI266484 


Hs.31570 


451103 


RS2804 


H&2S956 


451220 


Ar124251 


Ks.26054 


451668 


Z43948 


IJ> 90C4ilA 


452197 


AVklQ23595 


HS.Z32u4o 


452331 


AAS98509 


Hs.29117 


452353 


C16825 


n&2Si9i 


4530^ 


BE537217 


Hs.30343 


453107 






453355 


AW295374 


Hs,31412 


453390 


AA882496 


HSJ284B2 


453531 


AA417940 




454741 


B£154336 




456579 


AA287827 


Ks^205 


456672 


AKD02016 


Hs.1 14727 


457400 


AF032g06 


HSJS2549 


457718 


F18572 


Ks.22978 


4596S5 


fmzr 





ESTs
ESTs

Homo sapiens clone 24425 mRNA sequence
interferon, gamma-inducible protein 30
ESTs

extracellular link domain-containing 1
nudix (nucleoside diphosphate linked moi
KIAA1233 protein

ESTs, Weakly similar to 138022 hypotheti
hypothetical protein FLJ10392
KIAA0758 protein

purine-rich element binding protein A

ESTs

ESTs

ESTs, Weakly similar to KIAA1324 protein

DKFZP564O206 protein

novel SH2-containing protein 3

cartilage acidic protein 1

ESTs

purine-rich element binding protein A
epithelial membrane protein 2
ESTs

vanilloid receptor-like protein 1

Homo sapiens cDNA FLJ11422 fis, clone HE

ESTs

ESTs, Weakly similar to JC5795 CDEP prot

gb:CM2-HT0342^299-05W)05 HT0342 Homo

upregulated by BCG-CWS

Homo sapiens, clone MGC:16327, mRNA, com

cathepsin Z

ESTs, Weakly similar to AUM^HUMAN AUJ S
gb:HSC1KA072 normalized infant brain cDN



1.00 


64.00 


aQ2 


4J38 


1.00 


97i)0 


0.93 


1.69 


1.00 


106l00 


0.40 


47.20 


1.00 


loaoo 


0.05 


8.21 


0.02 


5.42 


IjOO 


79J)0 


042 


1.56 


0.17 


11.33 


1X10 


94.00 


1.00 


91.00 


1X0 


15Z00 


1.00 


66.00 


0.60 


1.30 


054 


1.91 


1.00 


67.00 


4.53 


11.07 


0.72 


2.24 


1.00 


68.00 


0.83 


1.70 


1.00 


13Z00 


1.00 


72.00 


1.00 


68X0 


0.57 


2.89 


1.00 


8100 


D.79 


196 


1.03 


3.25 


1,00 


113.00 


1.00 


544.00 



TABLE 10B

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey
408074 

411667 
413533 

423387 



ST ^'5Iw63003AA333976AA334725AA334151AW3654gOAA310513AIB1K^ 
C06094AW104S34 

BE 46797 BE146776 BEM638S BEM6793 BE146768 BE146771 8Et469S4 BE146760 BE147(M8 BE147025 BE147030 
22779 1 S74U110OTL13M8X^^ 

423S36 231,2.1 ^^^.^^^''^l^S^^Sm^ 

^ A1864375AI206100AA912444AI269365AI640254AW772465AI867336AAe27e04H16914AA358477AA^ 

430212 314437_1 AA469153 AI718503 AA469225

436532 421802_1 AA721522 AW975443 T93070

453531 97026_1 AA417940 AA036735 T07025

454741 1232569_1 BE154396 AW817959 BE154393



TABLE 10C

Pkey:
Ref:

Strand:
Nt_position:

Pkey 

400754 

401045 

401083 

402474 

40280B 

403021 

403421 

403438 

4036S7 

403764 

404277 

404288 

404394 

404518 

404916 

405105 

405257 

405381 



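Unique number corresponding to an Eos probeset
Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Indicates DNA strand from which exons were predicted.
Indicates nucleotide positions of predicted exons.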



Ref 


Strand 




7331445 


Plus 


144559-144684 


8117619 


Plus


90044-90m91111-91345 


3242744 


Plus


33192-33360 


7547175 


Minus 


53526-53628,55755«920^53O57757 ^ ^^^^ 


6456148 


Minus 


114984-115136.115461-115585.115931-116047.117666-117771.118004-118102 


7547270 


Plus 


120799-120966 


9665041 


Minus 


126609-126773.139986-140205 


S719679 


Plus


90792-90938 


7387384 


Plus


9009-9534 


7717105 


K^s 


116692-118853 


1834458 


Minus 


91665-91945 


2769644 


Plus 


3512-3691 


3135305 


Minus 


37121-37205,3749147762.4105341140,4132241593.4177341919 


8151988 


Plus 


84494-84503 


7341826 


Plus 


91057-91188 


8079395 


Minus 


80877-81418 


7329310 


Plus


73121-73273 


6006920 


Minus 


763S-80S4 
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TABLE llA: Genes Obfii^udmgAdenocatrinoraa to 

Tabtemsto«saI)o»i84senesupregutetedto!^ Thase genes weraseJeded 

fim abort 59o80 prabesets on ihe Eos/Afiyme!nx Hi^ 

T«w«tiochA«frftac«»«««n!imhefsfate^ Fof eacfa probeset vre have Ssted the gfifw chistEf mmiber fto^ 

'Accession' column.

T^llCshawfteganonfcposfflcningfefttio^ fcreaApieiSdedea^vBtandsisd^ie^pmic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples
R2: Average of non-malignant lung disease samples divided by the average of normal lung samples

Pkey

403329

406399

405690

407869

407881



ExAccn UnigeneID



409103 
409187 



41X76 
410102 
410399 
411908 
412612 
414075 
416208 
417542 
419163 
419502 
419631 
420931 
421155 
421190 
421474 
421515 
421582 
422026 
422095 
422311 
422867 
423472 
423SS4 
424502 
424544 
424905 
424960 
425523 
426230 
427701 
428585 
428758 
429170 
423263 
429810 
430508 
430985 
431548 
431666 
431986 
432375 
43^77 
433556 
433819 
434001 
434424 
434792 
436217 
436749 
436972 
437886 
437935 
438915 
439451 



fjI29540 

M827976 

AW072003 

BH296227 

AF251237 

AF154830 

AA576S53 

T05387 

AW24S50a 



1^7943 

NM_000047 

U11862 

AW291168 

J04129 



AU076704 

AW188117 

AF044197 

H87879 

U95031 

U76362 

Y11339 

A1910275 

U80738 

AJ868872 

AF073515 

L32137 

AF041260 

M90516 

AF242388 

MB8700 

NM-002497 

BE245380 

AB0O7948 

AA367019 

AA411101 

AB007e63 

AA43398& 

NM,001394 

AA019004 

Aa024937 

AI015435 

AA490232 

AI834273 

AF176012 

AA536130 



NM_004482 
W56321 

Av;raii097 

AW950905 

AI811202 

AA649253 

T53925 

AA584890 

AA234679 

AA156781 

AyV339591 

AA280174 

AP08827O 



Hs^29 

Hsi439V 

Hs.40968 

Hs.250822

Hs.112208

Hs.50966 

Hs^72 

Hs.7991

Hs.279727 

H172924 
Hs.74131 
Hs.75741 
Hs.41295
Hs.82269 
Hs.89663 

Hs^1S4 
Hs.100431 
Hs.102267 
Hs.102482 
Hs.104637 
Hs.105352 

Hs.110826 

Hs.282804 

Hs.114948 

Hs.1584 

Hs.129057 

Hs.1674 

Hs.149585 

Hs.150403 

Hs.153704

Hs.153952 

Hs.158244 

Hs^41395 

Hs.243886

Hs.185140 

Hs.98502 

Hs.2359 

Hs.198396 

Hs.211092 

Hs.104637 

Hs.27323 

Hs.9711 

Hs^0720 

Hs.149018 

Hs^62 

Hs.278611 

Hs.111460 

Hs.112765

Hs.3697 

Hs.325335 

Hs.132456 

Hs.107 

Hs.5302 

Hs.25640 

Hs5S40 

Hs^8S681 

Hs;Z78554 



Unigene Title
Target Exon

NM_003122*:Homo sapiens serine protease
carcinoembryonic antigen-related cell ad
hypothetical protein FLJ13612
heparan sulfate (glucosamine) 3-O-sulfot
serine/threonine kinase 15
XAGE-1 protein

carbamoyl-phosphate synthetase 1, mitoch

hypothetical protein FLJ13352

ESTs

Homo sapiens cDNA FLJ14Q35 fis, clone HE
synuclein, gamma (breast cancer-specific
cytidine deaminase



amiloride binding protein 1 (amine oxida
ESTs, Weakly similar to MUC2_HUMAN MUCIN
progestagen-associated endometrial prote
cytochrome P450, subfamily XXIV (vitamin
fibrinogen, A alpha polypeptide
popeye protein 3

small inducible cytokine B subfamily (Cy
lysyl oxidase

mucin 5, subtype B, tracheobronchial
solute carrier family 1 (glutamate trans
GalNAc alpha-2, 6-sialyltransferase I, l
trefoil factor 1 (breast cancer, estroge
trinucleotide repeat containing 9
hypothetical protein FLJ22704
cytokine receptor-like factor 1
cartilage oligomeric matrix protein (pse
breast carcinoma amplified sequence 1
glutamine-fructose-6-phosphate transamin



dopa decarboxylase (aromatic L-amino aci
NIMA (never in mitosis gene a)-related k
5' nucleotidase (CD73)
KIAA0479 protein
protease, serine, 1 (trypsin 1)
nuclear autoantigenic sperm protein (his
KIAA0403 protein
hypothetical protein FLJ14303
dual specificity phosphatase 4
ATP-binding cassette, sub-family A (ABC1
LUNX protein; PLUNC (palate lung and nas
ESTs

ESTs, Weakly similar to 178885 serine/th
novel protein

J domain containing protein 1

Novel human gene mapping to chomosome 20

S100 calcium-binding protein P

UDP-N-acetyl-alpha-D-galactosamine:polyp

calcium/calmodulin-dependent protein kin

ESTs

serine (or cysteine) proteinase inhibito
Homo sapiens cDNA: FLJ23523 fis, clone L
ESTs

Mi09en4ik8l

lectin, galactoside-binding, soluble, 4
claudin 3

metallothionein 1E (functional)
mucin 13, epithelial transmembrane
Williams-Beuren syndrome chromosome regi
heterochromatin-like protein 1



R1 


R2 


1.00 


61.00 


1.00 


39.00 


22a37 


350.00 


a77 


1.18 


1.00 


10.00 


7.76 


1.00 


80.44 


40.00 


1.00 


1.00 


1X0 


1.00 


1.12 


1.50 


9.89 


1.00 


a92 


1.06 


1.00 


1.00 


1.02 


1.03 


0.84 


1.07 


3.67 


1.00 


1.28 


1.35 


1.00 


too 


13.05 


115.00 


1.00 


13.00 


1.00 


aoo 


m 


15.00 


1.17 


1.55 


1.46 


1.76 


1.00 


3.00 


1.23 


1.00 


1.00 


52.00 


4.37 


Z34 


1.15 


1.78 


1.69 


ai7 


48.13 


72.00 


1.00 


50.00 


1.00 


1.00 


1X0 


59.00 


21^ 


1.00 


1.00 


1.00 


1.00 


35.00 


1.00 


83.00 


7.41 


'34.00 


1.00 


&00 


1.06 


1.13 


16.18 


105.00 


1.07 


1.00 


1.59 


1.69 


4.75 


7.27 


0.94 


1.28 


5.66 


15.00 


49,76 


37.00 


1.19 


1.47 


1.65 


1.05 


1.00 


48.00 


1.00 


19.00 


3.71 


aoo 


29J1 


72.00 


VOO 


84.00 


as2 


44.00 


57.97 


31.00 


1.10 


1.41 


1.59 


1.48 


3.62 


101.00 


1.60 


1.39 


1.00 


1.00 


23.28 


SZOO 
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439759 
441031 
441377 
443614 
443813 
443991 
444670 
444931 
446102 
446163 



AL35S0S5 

All 10684 

BE218239 

AV^53S6 

AA876372 

»^Q02250 

H58373 

AV65203O 

AW168067 

AAO26S80 



447383 
447532 
448243 
448844 
449444 
451807 



453392 
453464 
453735 



AWB30534 

AKD00614 

AW359ni 

A15B1519 

AW818436 

W52854 

F33868 

U237o2 

AI884311 

AI066629 



H&57709 

H&.7045 

HsiaSS6 

HsJ645 

KsJ93g61 

HS.1Q082 

Hs33293a 

Hs.^113 

H&d17694 

H&2SK2 

H&15113 

Hi76277 

Hs.18791 

Hs^620 

H&177164 

Hs^O 

HsJ84176 
Ks^2964 
Hs32989 
HS.12S073 



Homo sapiens mRNA full length insert cDN

fibrinogen, B beta polypeptide

ESTs

fibrinogen, B beta polypeptide

Homo sapiens mRNA; cDNA DKFZp667D0SS (fr

potassium intermediate/small conductance

hypothetical protein MGC5370

general transcription factor IIIA

ESTs

Homo sapiens cDNA FLJ13G03 fis, clone PL
homogentisate 1,2-dioxygenase (homogenti
Homo sapiens, clone MGC:9381, mRNA, comp
hypothetical protein FLJ20607
integrin, beta 8
ESTs

solute carrier family 16 (monocarboxylic
hypothetical protein FLJ23293 similar to
transferrin

SRY (sex determining region Y)-box 11
Feoe(tfDr(i
ESTs





21J00 


1.41 


99i» 


22m 


liX) 


1.00 


16.00 


1^ 


159 


5.71 


6.87 


1.98 


38.00 


1.00 


54XK) 


1.00 


1.00 


1.00 


35.00 


1.00 


11.00 


U4 


1.16 


1.23 


1.63 


15.84 


1.00 


1.00 


31.00 


1.00 


83-00 


1.55 


35.00 


1.54 


1.44 


1.00 


\6M0 


1.55 


145 


1.01 


1.30 



TABLE 11B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey 
410399 



CAT Number 



419502 18535_V 



421582 2041J 



437865 44433.2 



451807 8865.1 



BE0a889 BE068882 ARWSII AF017255NM.003Q87 AF037207 AF010126AA633976AA872836BE29M2SKa98©P^^ 

aS^/^^5AA394097AI139933AA946606BE171313AA722407AA293803AI4684W 

J^1W37H^AA4^2M4^^^ 

^ST74£4T74^mW8T^^^ 

TSTW3T71800T68355 T61227T62738 T69317 T53860T64692T73768 T739gT2^ 

770498 T6 1409 T58925 NM 000508 M64g82 T68301 T73729 TB9445 Te0424 T67922 T67736 T68716 T67755 n4765 T73819 T58719 

T71916T73787T56035T64425T71870T60476 T61376T67820 T71895T41006TC^ 

H4^ T71914 T53939 T64121 AA693396 T72525 T67779 T68078 AA01 1465 AA345378 AV654847 AV654272 AV656001 A!064740 Ta2897 

S^S6HSra9H38^ 

a5m7077V>W^^ 

S52T^^^7tK 

T69283 T73931T72178T72456AV645639AV653476T72957 T72300 T5B906 T714S7T70494T72955 T704^^^^^ 

aSS726 T27854 T74485 T74101 T73868 T71S18 T72304 AA3433S3 T73909 T68070 T72065 H72149 T73493 T73495 AV64S993 R02293 

TTC^S T64751 AA344441 AA343657 AA345732 AA344328 A1110639 AA344603 AF063513 T64696 T68516 T72223 T60507 T67533 R29500 

T^8St6B258AV65C429 T73341T61702T74598 T40095KD2272 T40106AA^^ 

T53747 T72042TB2764 A1064899 AA343060 T67832T72440T71770 T68091 T69108T72449T69167T71289^T58M1 iW854^ 

AA345234 T67598AA011414T6afl36H48262AJ207557 T582l9W8603lT69081T64232R93l96 

AA344583 T60362 H58121T95711T72803 T68055 T7m5R29036T72793T69122T6459ST6^ 

AAS3592 AI248502 R29454 T54754 T57001 T73052 T71429 T51 176 T58866 AVB55414 H9 

T67837 T73317T74273T69420 T68245 T74380T67862 T74474 T56068 ^ 

W910275 X00474XS2003 X05030NM.003225AA314326AA308400AA5067B7AA3148^^ 

Si2aJ£14409AA307578AI925552AW550155AI910083M12075BEO^^ 

BE074140 AA514776 AA588034 88374051 BE074068 AW009769 AWOSOSSO AA858276 R55389 AI001051 AW050700 AW750216 AA614539 
BE074045 A1307407 AW602303 BE073575 AI202532 AA524242A1970839 Ai909751 BE076078 A1909749 R55292 
AA156781 AW293839 U52054 AA024963AA778446 BE073977 AW444904 AW602574 BE164040 BEt64012 BE163972 BE163974 BEV^ 
AA837481 AVW68444 BE185091 AW46800ZAA687333AA81I830AA581806AI866686AI572124AA043777 AA04O926 O20160A1536733 
AA81 2489 AW8741 42 AI471883 V\IB4421 AA156650 

W52854 AL117600 BE208116 BE208432 BE206239 BE082291 AW953423 AA3S1619 BE180548 BE140S60 W60080 AA865478 N90291 
AI^^AW449519AA593634A1806539AA3S1618AW449522AI827626AAW 



TABLE 11C

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA

sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref
403329 8516120 
406399 



Strand 
Plus 



Nt_position



63448^3554 
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TABLE 12A: Genes DisBi^iilsMt3gSquan«»CdCadnoraa to 

Tdbte 12A sho» about 72 genes upregiflaW to s<j^^ 
were seiacted to about 59680 prabesete on tl» EosjMQfnB(^ 

TaWoi9B<ih«wihflmsskmminibere for those Pk^^ Fa each probeselwha« listed Oie gene ckisternumb^ 

sWlai^ustagCksteriifla^ The Gcnbankaocessnnnuirtos for sequent 



Table 12CshewlhegemwKposffionmgforlhosePkB/sla^ each piedaedejcon,«e hare listed fee genonw 

sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title
R1: 



R2: 

Pkey 

400289 
400866 
401760 
401781 
401785 
401994 
402075 
404996 
407839 
408000 
408522 
410561 
415091 
415817 
416658 
417034 
417386 
418663 
418678 
419121 
420783 
421773 
421948 
421978 
422156 
422440 
423634 
423725 
423738 
424012 
424046 
424098 
424834 
425650 
427099 
427335 
428182 
428645 
428748 
429259 
429538 
429903 
430486 
430890 
431009 
431846 
433091 
434360 
434880 
435505 
435793 
436511 
438403 
439285 
439606 
43S570 
439706 
440325 
441525 
443162 
444378 



Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
S^^SS^SS^disease samples 



ExAccn 
X07820



UnigeneID
Hs^ 



AA045144 

L1 1690 

AI541214 

BB40255 

AL044872 

U88967

U03272 

NM_006183

BE185289 

AKD01100 

NM_001327 

AA374372 

AI659838 

W69233 

L42583 

AJ243662 

L10343 

NM_004812 

AW959908 

AJ403108 

AB002134 

AW368377 

AF027866 

AF077374 

AK001432 

NM_001944

AB032953 

AA448542 

BE386042 

AA431400 

AW593206 

AA420450 

BE182592 

AL134197 

BB)62109 

X54232 

8E149762 

BE019924 

Y12642 

AW015415 

U02388 

AF200492

AB037734 

AA721252 

AAS06607 

AL133916 

W79123 

AF08BO76 

AW872527 

NM_003812

AW241867 

T49951 

R41339 



Hs.151566 

Hs.620 

Hs.46320 

Hs^g94 

Hs.77910 

Hs.78867 

Hs.79432 

Hs.80962

Hs.1076 

Hs.41690 

Hs.87225 

Hs.89626

Hs.99923 

Hs.112457 

Hs.334309 

Hs.110196 

Hs.112341

Hs.116724 

Hs.1690 

Hs.132127 

Hs.132195 

Hs.137559 

Hs.138202 

Hs.139322 

Hs.153408 

Hs.1925 

Hs.173560 

Hi251677 

Hs^lT 

Hs.98729 

HS.9878S 

Hs.292911 

Hs.11261 

Hs.93597 

Hs.241551 

HS.26S9 

Ks.48956 

Hs.271580 

Hs.3185 

Hs.127780 

Hs-101 

HS211238 

Ks.4993 

H5^91502 

Hs^92206 

Hs.58561 

Hs,59507 

Hs^9761 

Hs.7164 

Hs.127728 

Hs.9029 

HS.12S69 



Unigene Title

matrix metalloproteinase 10 (stromelysin
NM_002425*:Homo sapiens matrix metallopro
NM_005557*:Homo sapiens keratin 16 (foca
Target Exon

NM_002275*:Homo sapiens keratin 15 (KRT1
Target Exon

ENSP00000251056*:Plasma membrane calcium

Target Exon

ESTs

bullous pemphigoid antigen 1 (230/240kD)
Small proline-rich protein SPRK [human,
Homo sapiens cDNA: FLJ22044 fis, clone H
3-hydroxy-3-methylglutaryl-Coenzyme A sy
protein tyrosine phosphatase, receptor-t
fibrillin 2 (congenital contractural ara
neurotensin

small proline-rich protein 1B (cornifin)
desmocollin 3



parathyroid hormone-like hormone

lectin, galactoside-binding, soluble, 7

ESTs

keratin 6A

NICE-1 protein

protease inhibitor 3, skin-derived (SKAL
aldo-keto reductase family 1, member B10
heparin-binding growth factor binding pr
hypothetical protein LOC57822
airway trypsin-like protease
tumor protein 63 kDa with strong homolog
serine (or cysteine) proteinase inhibito
small proline-rich protein 3
Homo sapiens cDNA FLJ10570 fis, clone MT
desmoglein 3 (pemphigus vulgaris antigen
odd Oz/ten-m homolog 2 (Drosophila, mous
G antigen 7B

ESTs, Weakly similar to GGC1_HUMAN G ANT
ESTs, Weakly similar to 2017205A dihydro
Ksp37 protein

ESTs, Highly similar to S60712 band-6-pr
small proline-rich protein 2A
cyclin-dependent kinase 5, regulatory su
chloride channel, calcium activated, fam
glypican 1

gap junction protein, beta 6 (connexin 3

uroplakin 1B

lymphocyte antigen 6 complex, locus D
ESTs

cytochrome P450, subfamily IVF, polypept

interleukin-1 homolog 1

KIAA1313 protein

ESTs

ESTs

hypothetical protein FLJ20093

G protein-coupled receptor 87

ESTs, Weakly similar to AC004858 3 U1 sm

ESTs, Weakly similar to DAP1_HUMAN DEATH

a disintegrin and metalloproteinase doma

ESTs

DKFZP434G032 protein
ESTs



R1


R2 


132.45 


4.00 




3.22 


"iR A7 
Z0.4/ 


10.50 








2.70 


CI UA 


A7 nn 


1 AO 
l.w 


1 00 


i.UU 


1.00 


If a.91 


108.00 


19). If 


8.00 


1 OQ 
1.90 


1.24 


1A t\A 


1.00 


I.UU 


JU.UU 


Oil 


1 nn 

I.UU 




51.00 


I.UU 


I.UU 






1 IZ.1/ 


nn 


1 1& 
1.10 


1 in 

1. lU 


4 iVt 
I.UU 


1 nn 

I.UU 


1 t\A 


I.O 


1 10 


1.14 


01.00 


20.25 


l.Ul 


0.91 


0 tJ 

£.or 


1.10 


A7 K'i 


32.00 


7RM 


1,00 


4.ZU 


1.00 


10,14 


51.00 


2^.42 


68.00 


1.00 


1.00 


137.82 


54.00 


56.19 


12.00 


33.45 


t.00 


4.24 


17,00 


51.83 


4.00 


1.00 


1.00 


1.00 


16.00 


1.00 


87.00 


2.01 


1.18 


4.43 


190 


11.80 


1.00 


12.28 


41.00 


1.58 


1.40 


60^ 


28.00 


4.49 


2.51 


1.20 


1.09 


40.98 


27.00 


1.00 


1.00 


1.00 


38.00 


23.68 


4100 


16.76 


14.00 


1.00 


1.00 


46.23 


139.00 


33X1 


1.00 


1.00 


1.00 


86.55 


11.00 


62.88 


147.00 


1.53 


1.42 


31.11 


38.00 


1.00 


1.00 
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4<6292 AF081497 Hs,279S82 Rb type C ^yopn^dn »-2B 

447079 AW88S727 Hs5914 ESTs Sm im 

447342 A1199268 Hs.19322 Htmwsapcer^ Similar to RIKENcONA 2010 28^63 l.W 

449003 X7o342 Hs^ eta*oldehj(*og8nasa7(dasslV).ira 100 U» 

449101 AA2D5847 Hs^16 G pnAsiitaxifkA receptor ^ 

450832 AW9706O2 Hs.105421 ESTs 

4S2240 A1591U7 Hs.61232 ESTs ]3« 

453317 ^iM.002277 Hs.41686 ferafin, har. acRficl 1-« Ifr^ 

453830 AA5342SS Hs.20953 ESTs f f 

4S4098 VC7953 Hs592911 ESTs,W5MysiraaartoS60712band^ Iffi J" 

455801 AI358$30 H&816 SRY(sexdelaniiiiiii«ie^Y)4i(w2 205.11 i.w 

TABLE 12B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

^5 SSwf" jS^N79113ATO8610tM76721AW950a28AA3540l3A 

439285 47065.1 J^^Sl »SS3AA626243AI341407BE175639AA456968AI358918AA^^ 

TABLE 12C 

sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Nt_position

^7-a617.^j29045.29135-29296.29411-29567,29705.2^^^ 
8321M343S.83531^6.8374a8390134237^3.84955^5037.8e9t^^ 
16577&-I659bj66189.l66314.166408.165559j671l2-I6725ai6738^^ 
4290443124 432l1-43336.44B07-44763,4519945281.46337-46732 
121907-122035,122804-122921,124019-124161,124455-124610,125672-126076



Pkey 


Ref 


Strand 


400568 


8118496 


Plus


401780 


7249190 


Minus 


401781


7249190 


Minus


401785 


7249190 


Minus 


401994 


4153858 


Minus


402075 


8117407 


Plus


404996


60O78SO 


Plus 
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TABLE 13A: Genes Dis&^uishiagNQi^Mal^DantUingC^^ 

TaUe 13A shffis aboul 23 genes upregid^ to noo4idi^ 
the Eos/Affymetrix Hu03 Genechip array.

^tw" iha niffnhps faf fliQsa Pkev's taddno UmQeneHys tor tefate 1 3A. Fbf each probeset to have listed the gene duster nunto 

cSgomx&oSdesweiBde^ned. Gene dusteswereaxnpfled using seq^ These s&^^ss^ fi^d^ t^m 

'Accession' column.

Tdteiacshwftagenonifcpoafioningto For each pcedWadexoo, we ham fisted ftegewiiric 

sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.






Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

^^^^ irtS^^bmgtoi^ 

R2: JS^SlSISSSSrdiseasesaii^itesr^^ 



Pkey


ExAccn


UnigeneID


Unigene Title


40K62 


AI436323 


Hs^l141 


Homo sapiens mRNA for KIAA1568 protein,


409031 


AA376836 


Hs.76728 


ESTs 


412372 


R&599d 


Hs.285243 


hypothetical protein FLJ22029


415910 


U20350 


Hs.78913 


chemokine (C-X3-C) receptor 1


417511 


AU)49178 


Hs^3 


Ghordff^&ke 


418819 


AA228776 


Hs.191721 


ESTs 


422060 


R20893 


Hs^23 


ESTs. Moderately simlbrtoALUSJKUMAN A 


424585 


AA464840 


Hs.131987 


ESTs 


426753 


T89832 


Hs,170278 


ESTs 


42S496 


AA453800 


H&192793 


ESTs 


430719 


AA488968 


Hsi93796 


ESTs 


431089 


BE04139S 




ESTs, Weakly similar to unknown protein


431385 


BE1 78536 


Hs.11090 


membrane-spanning 4-domains, subfamily A


431728 


NM_007351


Hs.268107 


multimerin


436532 


AA721522 




gb3iv54h12j1 Na_CGAP_Ew1 Homo sapiens 


437980 


A1869586 


Hs.222194 


ESTs 


438202 


AW169267 


Hs.22588 


ESTs 


441499 


AW298235 


Hs.101689 


ESTs 


444513 


AL120214 


Hs.7117 


glutamate receptor, ionotropic, AMPA 1


448253 


H25899 


Hs.201591 


ESTs 


453636 


R67B37 


Hs.t69872 


ESTs 


458332 


AI000341 


Hs.220491 


ESTs 


459587 


AAOSISSe 




gtKzlc1Ss044l Soare8jiregnant.utsnisJ^ 



R1


R2 


1.00 


230X0 


m 


128.00 


1X0 


173.00 


1.00 


145.00 


1.00 


179.00 


1.00 


140X0 


1.00 


15&00 


1iX> 


167.00 


1.00 


141.00 


1.00 


13&00 


1.00 


moo 


a32 


941.00 


1.00 


167.00 


1.00 


157X0 


1.00 


218X0 


1.00 


147.00 


1.00 


141.00 


1X0 


167.00 


1.00 


151.00 


1.00 


141.00 


1X0 


116X0 


1X0 


19Z0O 


1.00 


154.00 



TABLE 13B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT Number Accession

431089 327825_1 BE041395 AA491826 AA621948 AA715980 AA666102

436532 421802_1 AA721522 AW975443 T93070



TABLE 13C

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA

sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position
402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
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TAa£t4A: PtefenedU^andSubceB^LocdKaGonforPo^^ 

Tabtel4Ashowsthesubcdul»lxdba6ona«lpr8l3ff^ ii^syn*<tenimKtoadana)ody,<^^ 
dagnoslic. sja synAofizes sma9mo!^ 
Genedup array. 

Table WBshwAieaccessunnumbas far those Pke/sl^^ For each probesatvffi hare IstEdOtegeiiechJstefouiiitf 

ol^omdeoedestferedes^ Gene dWewirarBcaivted tang sequent These seqieoottiwdtgtoBdhaM^^ 

stnMy using Clustering and /%tfnenlToals{D^ The Genbankaccessiaiounibefs for sequences oonq^^ 

'Accession' column.

Table 14C show Ihe genomic positiooJng far those Pto/sl^^ For each predidedexon, we ham fisted fte genomic 

sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

Pref. Utility: Preferred Utility

Pred. Loc: Predicted subcellular localization



Pkey


ExAccn


UnigeneID


Unigene Title


Pref. Utility


400289 


X07820 


Hs.2258


matrix metalloproteinase 10 (stromelysin


mAbAdiag&sm 


400303 


AA242758 


Hs.79136


UV-1 protein, estrogen regulated 


mAb 


402075 




ENSP000002S105G*:Piasmafflembranecakiufn mAb&<na9 


407811 


AW190902 


HS.400S8 


cysteine knot superfamily 1, BMP antagon


diag 


408243 


Y00787 


Hs.624 


interleukin 8


dtag 


408790 


AV\ra80227 


Hs,47860 


neurotrophic tyrosine kinase, leceptor. 


mAb&sJit 


408S08 


BE296227 


Hs.250822 


serine/threonine kinase 15


sm 


409041 


AB033025 


H5.SQ081 


Hypothetical protein. XP_051850 (KIAA1 19 


CTL&(fiag 


409103 


AF251237 


HS.11220B 


XAGE-1 protein


ca 


409420 


Z15008 


Hs.54451 


laminin, gamma 2 (nicein (100kD), kalini


diag 


409632 


W74001 


Hs.55279 


serine (or cysteine) proteinase inhibito


dlag 
(Sag 


409757 
409893 


NM«001898 


Hs.123114 


cystafin SM 


AW247090 


Hs.57101 


minichromosome maintenance deficient (S.


CTL 


409958 


AW103364 


Hs.727 


inhibin, beta A (activin A, activin AB a


diag 


410001 


AB041036 


Hs.57771 


kallikrein 11


(Sag 

mAb&sm 


410407 


X66839 


Hs.63287 


carbonic anhydrase IX 


410418 


D31382 


Hs.63325 


transmembrane protease, serine 4 


mAb & diag &s.nL 


412140 


AA219691 


HsJ3525 


RAB6 interacting, kinesin-like (rabkines


S.ffl. 


412719 


AW016610 


Hs.816 


ESTs 


sjn. 


414774 


X02419 


Hs,77274 


plasminogen activator, urokinase


diag 


414883 


AA926g60 




CDC28 protein kinase 1


sja 


415138 


C1B356 


HS.29S944 


tissue factor pathway inhibitor 2 


GTL&diag 


415669 


NM_005025 


HSJ8589 


serine (or cysteine) proteinase inhibito


mAb & diag &sjn. 


415817 


U88967 


HS.78B67 


protein tyrosine phosphatase, receptor-t


mAb&sJTk 


416658 


U03272 


Hs.79432 


fibrillin 2 (congenital contractural ara


diag 


417034 


NMl.006183 


Hs.80962 


neurotensin


(Sag 


417079 


U65590 


H5.81134 


interleukin 1 receptor antagonist


diag 


417308 


H60720 


Hs.81892 


KIAA0101 gene product


s.ra 


417389 


BH260964 


Hs.82045 


midkine (neurite growth-promoting factor


mAb & diag 


417433 


BE270266 


Hs.82128 


5T4 oncofetal trophoblast glycoprotein 


mAb 


417933 


X02308 


Hs.82982 


thymidylate synthetase 


sm 


418478 




nS.I 


cyclin-dependent kinase inhibitor 2A (me


sm 


418506 


AA084248 


Hs.85339 


G protein-coupled receptor 39


mAb&sjn. 


418678 


NM^001327 


Hs.167379 


cancer/testis antigen (NY-ESO-1)


CTL 


419121 


AA374372 


Hs.89626 


parathyroid hormone-like hormone


diag 

mAb&s.m. 


419171 


NM_002846 


HS-8985S 


protein tyrosine phosphatase, receptor t 


419183 


U60669 


Hs.89663 


cytochrome P450, subfamily XXIV (vitamin


CTL&sm 


419216 


AU076718 


Hs.164021 


small inducible cytokine subfamily B (Cy


diag 

mAb&diag 


419235 


AW470411 


Hs^33 


neurotrimin


4194S2 


U33S35 


HS.90S72 


PTK7 protein tyrosine kinase 7 


fflAb&sm 


419556 


U29615 


Hs.910g3 


chitinase 1 (chitotriosidase)


mAb&diag 


420610 


A!883183 


Hs.99346 


distal-less homeobox 5


CTL 


421110 


AJ250717 


Hs.1355 


cathepsin E


sm&diag 


421379 


Y15221 


Hs.103962 


small inducible cytokine subfamily B (Cy


diag 

mAb&s.m. 


421474 


U76362 


Hs.104637 


solute carrier family 1 (glutamate trans


421552 


AF025692 


HS.10S700 


secreted frizzled-related protein 4 


dlag 


421753 


BE314828 


Hs.107911 


ATP-binding cassette, sub-family B (MDR/ 


mAb&sjTi 


421817 


AF146074 


H&.108660 


ATP-binding cassette, subfamily C (CFTR 


mAb&s.m. 


422109 


S73265 


Hs.1473 


gastrin-releasing peptide


diag 


422158 


L10343 


Hs.112341 


protease inhibitor 3, skin-derived (SKAL


diag 


422282 


AF019225 


Hs.1 14309 


apolipoprotein L


diag 


422283 


AW411307 


Hs.11431t 


CDC45 (cell division cycle 45, S.cerevis


SilL 


422424 


AI186431 


Hs^96638 


prostate differentiation factor


diag 


422765 


AW409701 


HS.1S78 


baculoviral IAP repeat-containing 5 (sur


sjn. 


422809 


AKG01379 


Hs.121023 


hypothetical protein FLJ10549


sjn. 


422867 


132137 


Hs.1584 


cartilage oligomeric matrix protein (pse


diag 


422956 


BE545072 


Hs.122579 


ECT2 protein (Epithelial cell transformi


CTL&sjn. 


423834 


AW959g08 


Hs.1690 


heparin-binding growth factor binding pr 


(Sag 

mAb&diag&sm 


423673 


BE003054 


Hs.1595 


matrix metalloproteinase 12 (macrophage


423961 


D13666 


Hs.135348 


periostin (OSF-2os) 


mAb&diag 


424046 


AFQ27666 


Ks.138202 


serine (or cysteine) proteinase inhibito


diag 


424381 


AA285249 


Hs.146329 


protein kinase Chk2 


sjn. 



Pred-Loc
extracellular
plasma membrane



plasma membrane

cytoplasm

secreted

nuclear

secreted

secreted

extracellular

nuclear

extracellular

extracellular

plasma membrane

plasma membrane

nuclear
extracellular



plasma membrane

extracellular

extracellular

extracellular

mitochondrial

secreted

plasma membrane
endoplasmic reticulum
cytoplasm
plasma membrane
cytoplasmic
secreted

plasma membrane
mitochondrial



plasma membrane
plasma membrane
extracellular
nuclear



secreted

plasma membrane



plasma membrane
plasma membrane



nuclear

extracellular

cytoplasm

nuclear

extracellular



secreted
extracellular
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424502 AF242388 Hs.14^ 

424503 NM_002205 Hs.149609 
424687 J05O70 
42S247 NMJ00594O 

5 425322 U63630 

425650 KM.001344 Hs.ig25 

42S734 AF05S209 Hs.15939S 

425776 U25128 

425852 AK001504 

10 428215 AW963419 

426427 MS5689 

426514 BE616633 

427335 AA448542 

427747 AW411425 

15 428242 H557Q9 

428330 L22524 

428450 NM_014791 

428479 Y00272 

428484 AF104032 

20 428664 AK001666 

428598 AA852773 

428748 AW593206 

428758 AA433988 

428969 AF120274 

25 429211 ARJ52693 

429263 AA0t9004 

429547 AWQ09166 

429610 AB024937 

429903 AL134197 

30 430486 eE062109 

431482 AW583672 

431515 NM_012152 Hs^58583 

431846 BE019924 Ks^71580 

431958 X63629 

35 432201 AIS38613 - 

433001 AF217513 

435505 AF200492 

43648V AA379597 

437016 AU076916 

40 437044 AL035664 

437789 AI581344 

437852 BE001836 

439223 AW238299 

439477 W69813 

45 439606 W79123 

439738 B£246502 

440006 AK000517 

441362 BE614410 

442117 AW664964 

50 443247 BE614387 

443426 AF098158 

443859 NM_013409 

444006 8E395085 

444371 BE540274 

55 444381 BE387335 

444781 NM_O1440fl Hs.11950 

445537 AJ245671 Hs.12844 

446619 AU076643 

446921. AB012113 

60 447033 AJ3S7412 

447342 A1199268 

448243 AW369771 

448844 AI581519 

449048 245051 

65 449722 BE280074 

450001 NIUL001044 

450375 AA009647 

450701 H39960 

450983 AA305384 

70 451658 243948 

452281 T93500 

NM_007115 

452747 

75 
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Hs.151738 
H$.1S5324 
HS.15S637 



HS.15S4® 

Hs.159651 

Hs.155223 

HS.189S40 

Hs.170195 

Hs^51677 

Hs.180655 

Hs^O 

Ks,2256 

Hs.184339 

Hs.334562 

Hs.184601 

Hs.189095 

Hs.334838 

Hs^785 

Hs.98502 

Hs.194689 

Hs.198249 

Hs.1983^ 

Hs.99376 

Ks^11092 

Hs.93537 

Hs;>41551 

Hs^56311 



Hs^877 

Hs^a241 

Hs^79905 

Hs^11238 

Hs.5199 

Hs.5398 

Hs.69517 

Hs.127812 

Hs.256897 

Hs.250518 

Hs.58042 

Hs.58561 

85.9598 

Hs.6844 

Hs.23044 

Hs.1 28899 

Hs^33893 

Hs.9329 

Hs.9914 

Hs.10086 

Hs^S 

Hs.283713 



H$^13 

Hs.16530 

Hs.157601 

Hs.19322 

HS.S2620 

Hs.177164 

Hs.22920 

H&23980 

Hs.406 



452401 

452747 BE153855 

452838 U65011 

453968 AA847843 

457489 A1693815 



Hs.288467 
Hs.25740 
Hs.326444 
Hs^8792 
Hs.29352 
Hs.61460 
Hs^0743 
Hs.62711 
Hs.127179 



intsgrin. alphas (Gbfonecfia recepSor. 
matrix (netsSopn^eiaasd 9 (ge!a&as8 B 
matrix metaSoprotsnase 11 (stromelysin 
poletn kinase, ONA^cSvated.calalySc 
desmogton 3 Cpempfc^us vulgaris aaSgen 
pep&^^Jydne aipha^nddaSng monocxyg 
paratfcyrcud hormona fecapbr 2 
death receptor 6, TNF supeifansly member 
stanniocakan 2 
TTK protein kinase 

txme morf^iogaietic protein 7 (osteogenb 
GantigenTB 

serine/Oveonine kinase 12 

leukemia inhibitDiy tsctor (cho&nog^ 

matiix metaOopn^anase 7 (matnlysin. 

KIAA0175 gene product 

ceO (fivisidn cytte 2. G1 to S and G2 to 

sdute canter faniity 7 (catkuiic amiDO 

similar to SAtll {sal posophla)«e 

K1AA1866 protein 

Ksp37protian 

CA125 antigen; mucin 16 



gap junction protein. t)eta 5(connexin 3 
ATP-t)inding cassefls. suMamOy A(ABC1 
ESTs 

lUNX protein; PLUNC (palate hing and nas 
cyc&wlependeni kinase 5. reguiatoiy su 
chtaride channel cateJum acfivated, fera 
granin-Ttka neuroendocrine pe^ precu 
endothefiai differenSation. lysophospha 
uroplakinIB 

cadherin 3, type 1, P^dherin (placenta 
Transmentaie protease, serine 3 
ctoneHC30310PRO0310p1 
inlerieukin>1 homotog 1 
HSPC150 protein similar to ut»quitinK»n 
guanine monphosphate synthetase 
difierentially expressed in PanoonTs an 
ESTs. Wdidy sin^ to T17330 hypoltieG 
ESTs.VfeaklysimgartodJ36501Z1 |H^ 
UL16 landing protein 2 

ESTs. Moderately similar to GFR3.HUMAN G 
G proteift<oufded receptor 87 
sema domain, immunogiobunn domain (Ig), 
NALP2 protein; PYRirWkjittalning APAF1-D 
RA051 (S. cerevlsiae) homolog (E coTi Re 
ESTs; hypothetical protein for IMAGE*447 
c-Myc target JP01 

chromosome 20 open reading frame 1 
tbilistatin 

type I transmembrane protein Fn14 
forkhead box Ml 

ESTs, Weakly simllarto S640S4 hypoBteti 
GPI-anchored metastasis-associated prate 
EGF^ike-domain, muKiple 6 
secreted phosphoprotein 1 (osiecpontin, 
small indudtiie cytokine subfamily A (C^ 
ESTs 

Honra sapiens, Similarto RiKEN cDNA 2010 

integrin.beta8 

ESTs 

Similar to S66401 (cattle] glucose indx 
cycGnBI 

solute carrier family 6 (neurotransmitte 
a disintegrin and metaOoproteinase doma 
hypothefical protein XPJ)98151 (leucine- 
ER01 (S. cerBvisia0]-Gke 
cartilage addle protein 1 
Homo sapiens cDNA RJ1 1041 fc. done PL 
tumor necrosis factor. aiph»nduced pro 
Ig superfamily receptor tJ^lR 
preferen^aily expri^sed anGgen in mala 
High mot^ty group (nonhistone dnomoso 
ciypticgene 



sm 

mAb&sm 

diag 

mAb&i£39&sja 

sm 

mAb 

sm 

niAb&diag 

mAb&sm 

mAb&diag 

CTL&sm 

mAb&diag 

CTt 

sm 

diag 

mAb&diag&sm 

sm 

sm 

mAb&sm 
CTL&sm 
mAb 

diag 
diag 
diag 

mAb&sm 
mAb&sm 
tfiag 

mAb&diag 
s.rn. 

mAb&sm 
diag 

mAb&s.ni 

mAb&diag 

mAb&diag 

mAb&diag&sm 

sm 

sm 
Sin. 
CTL 
CTL 

mAb&sm 
mAb 

mAb&sm 

mAb&s.nL 

mAb&sm 

sm 

sm 

mAb&sm 

CTL 

CTL 

diag 

mAb 

sm 

drag 

mAb&diag 
mAb&diag 
diag 
diag 

CTL & diag 
CTL 

mAb&sm - 
mAb&s.m. 
mAb 
sm 

mAb&sm 
mAb & diag &s.m. 
mAb&diag 
diag 

mAb&diag 



cytaplaanic 



cxtiaoelb^af 
secreted 
Qftaplasnu: 
plasma membrane 



nudear 



qftofdasiric 
cyh$iasmic 

extracellular 



plasma memtirane 

nudear 

extracellular 
mitocbodiiar 
extraceBuiar 
ptesna mendjrane 
plasnta fluroltfane 



plasma membrane 



plasma membrane 
plasma membrane 
plasma membrane 
plasma membrane 
nudear 



cytoplasm 
ER 

nudear 

plasma membrane 
plasma membrane 

plasma mendirsne 
plasma membrane 



plasma membrane 
extiacellulaf* 



plasma membrane 

nudear 

secreted 

plasma membrane 

secreted 

secreted 

extracellular 

secreted 

plasma membrane 

plasma membrane 
cytoplasm 
plasma membrane 



mAb 
CTL 

CTL&sm 



plasma membrane 
secreted 

plasma memtirane 

exfraceDular 
plasma membrane 
nudear 
nudear 
secreted 



TABLE 14B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT Number Accession



WO 02/086443 PCT/US02/12476
414883 1S024 1 AA92fi960AA9269aW76S21W24270W2152oAA037172BE267636H8318SAMfi99Q9N86^ 
- AA0824»H72SSH77575 N497ffiWfflBSH78746BES9085VVD4339 R98127 T5S^ 

AA2927S3AA17^)43^aL0018Z5X54S41 BE314366AA9O8783AI7l9O75BE27(n72BE26S819AAE^AI204630VV25243Ae35 
/WW2039VV72395T99630Ai42269iH984SOK31428BE255916H03265A!^ 

R759S3 AVW62396 AA6e2S22 AI865147 AM23153 AW2B2230 AAS84410 AA583187 AVl'0245K AW069734 AI828^ AA28ag7 Mfl76046 

AVVS13002AAS27373AVV9724S9A!831360AA621337AA1()0926AA772418AA594628 

N95210 AI4S432 AI041437 AA332124 AA6276S4 AA9358S AHJ04827 AJ423513 AI094597 m2079 RW^ 

AA643280 VmSSI Ae9198S AiS37692 AH)90252 AA74081 7 AI312104 AB1 1822 AM15871 An854ffl 

A1139549 AA533648 AI339996 AI33S880 AA399239 AI078708 A1085351 A1362835 AI3466t8 AI14o955 Ai989380 AI348243 N92892 AA765850 
AI494230AI278887AA962S98AM92600VV80435AAD01979R97424A!129015N2412^ 

AM94211 AVTOSSaOl AW886710 R327S0 N59755 A1361128 AW589407 H47725 H97534 H48076 H48450 1^9631 AW300758 H03431R7B789 
AA954344 H77576 R96823 AM57100 N92845 N496a2 H42038 BE220698 BE220715 H995S2 AA701624 N74173 R54704 H79520 H72923 
H03256 BE261919 AA769633 AA4B0310 AA507454 AA910586 AI203723 AW1(M7S W25611 W25071 T88980 H03513 T77S89 R99156 
WffiOas R97470 AA702275 T77551 AA91 1952 H82956 NB3673 AA283672 

AA009647 AA131254 AA374293 AyV9544(teH04410AW6062B4 AA151166 BE1574S7 BE157501 HM384 W46»1 AW563674 H04021 H01532 
AA190933 K03231H59605 H01642AA852876AA1137S8AAS26S15AA746952AI161014AA099554RSS^ 



450375 83327.1 



TABLE 14C 

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus i21907-122035k122804-122921.124019-1241S1.124455.1246iai25672-126075 
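
As an illustration of how an Nt_position entry of the kind described in the Table 14C legend might be read by software, the following is a minimal sketch only. It assumes that exons are written as comma-separated "start-end" pairs and that both coordinates are inclusive; neither assumption is stated in the legend, and the separators in the extracted row above are partly illegible. The function names are hypothetical, and the two ranges used are the legible ones from the row for Pkey 402075.

```python
from typing import List, Tuple


def parse_exon_positions(field: str) -> List[Tuple[int, int]]:
    """Parse 'start-end,start-end' into a list of (start, end) tuples.

    Assumption: ranges are comma-separated and written as start-end.
    """
    exons = []
    for part in field.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma or doubled separator
        start_text, end_text = part.split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start {start} exceeds end {end}")
        exons.append((start, end))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, assuming both coordinates are inclusive."""
    return sum(end - start + 1 for start, end in exons)


# Two legible ranges taken from the Pkey 402075 row of Table 14C.
exons = parse_exon_positions("122804-122921,125672-126075")
print(exons)
print(total_exon_length(exons))
```

If the coordinates were instead half-open, the "+ 1" in the length calculation would have to be dropped.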
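TABLE 15A: Information for all sequences in Table 16

Table 15A shows the Seq ID No, Pkey, ExAccn, and UnigeneID.

Table 15B shows the accession numbers for those Pkeys lacking UnigeneIDs in Table 15A. For each probeset, the gene cluster number from which the oligonucleotides were designed is listed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and Alignment Tools. The Genbank accession numbers for sequences comprising each cluster are listed in the 'Accession' column.

Table 15C shows the genomic positioning for those Pkeys lacking UnigeneIDs and accession numbers in Table 15A. For each predicted exon, the genomic sequence source used for prediction is listed. Nucleotide locations of each predicted exon are also listed.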



Seq ID No: Sequence ID number
Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession
UnigeneID: Unigene number
Unigene Title: Unigene gene title
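
The five fields defined above can be pictured as one record per table row. The following is a minimal sketch, not part of the original disclosure: the class name `Table15ARow` and the helper `parse_seq_ids` are hypothetical, and the sketch assumes only that a "Seq ID No" label carries one or two integers, as in the entries "Seq ID No: 1 & 2" and "Seq ID No: 49".

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Table15ARow:
    """One row of Table 15A, using the field names from the legend."""
    seq_ids: Tuple[int, ...]      # one or two sequence ID numbers
    pkey: str                     # unique Eos probeset identifier
    ex_accn: str                  # exemplar accession
    unigene_id: Optional[str]     # None when the row lists no UnigeneID
    unigene_title: str            # Unigene gene title as printed


def parse_seq_ids(label: str) -> Tuple[int, ...]:
    """Extract the integer sequence IDs from a 'Seq ID No: ...' label."""
    numbers = re.findall(r"\d+", label)
    if not numbers:
        raise ValueError(f"no sequence ID found in {label!r}")
    return tuple(int(n) for n in numbers)


# Example built from the first row of Table 15A.
row = Table15ARow(
    seq_ids=parse_seq_ids("Seq ID No: 1 & 2"),
    pkey="410407",
    ex_accn="X65839",
    unigene_id="Hs.63287",
    unigene_title="carbonic anhydrase IX",
)
print(row)
```

Keeping `unigene_id` optional reflects rows that show no UnigeneID, which are the rows Table 15B is described as covering.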



Seq ID No             Pkey    ExAccn     UnigeneID  Unigene Title

Seq ID No: 1 & 2      410407  X65839     Hs.63287   carbonic anhydrase IX
Seq ID No: 3 & 4      412719  AW016610   Hs.816     ESTs
Seq ID No: 5 & 6      417034  NM_006183  Hs.80962   neurotensin
Seq ID No: 7 & 8      430486  BE062109   Hs.241551  chloride channel, calcium activated, fam
Seq ID No: 9 & 10     407788  BE514982   Hs.38991   S100 calcium-binding protein A2
Seq ID No: 11 & 12    407788  BE514982   Hs.38991   S100 calcium-binding protein A2
Seq ID No: 13 & 14    407788  BE514982   Hs.38991   S100 calcium-binding protein A2
Seq ID No: 15 & 16    407788  BE514982   Hs.38991   S100 calcium-binding protein A2
Seq ID No: 17 & 18    439285  AL133916              hypothetical protein FLJ20(^
Seq ID No: 19 & 20    413753  U17760     Hs.75517   laminin, beta 3 (nicein (125kD), kalinin
Seq ID No: 21 & 22    120486  AW368377   Hs.137569  tumor protein 63 kDa with strong homolog
Seq ID No: 23 & 24    425650  NM_001944  Hs.1925    desmoglein 3 (pemphigus vulgaris antigen
Seq ID No: 25 & 26    412140  AA219691   Hs.73625   RAB6 interacting, kinesin-like (rabkines
Seq ID No: 27 & 28    423673  BE003054   Hs.1695    matrix metalloproteinase 12 (macrophage
Seq ID No: 29 & 30    452838  U65011     Hs.30743   preferentially expressed antigen in mela
Seq ID No: 31 & 32    418663  AK001100   Hs.41690   desmocollin 3
Seq ID No: 33 & 34    418663  AK001100   Hs.41690   desmocollin 3
Seq ID No: 35 & 36    409532  W74001     Hs.55279   serine (or cysteine) proteinase inhibito
Seq ID No: 37 & 38    429610  AB024937   Hs.211092  LUNX protein; PLUNC (palate lung and nas
Seq ID No: 39 & 40    406690  M29540     Hs.220529  carcinoembryonic antigen-related cell ad
Seq ID No: 41 & 42    431846  BE019924   Hs.271580  uroplakin 1B
Seq ID No: 43 & 44    418830  BE513731   Hs.88959   hypothetical protein MGC4816
Seq ID No: 45 & 46    424098  AF077374   Hs.139322  small proline-rich protein 3
Seq ID No: 47 & 48    443648  AI085377   Hs.143610  ESTs
Seq ID No: 49         311034  BE567130   Hs.311389  ESTs, Highly similar to NKGD_HUMAN NKG2-
Seq ID No: 50 & 51    408522  AI541214   Hs.46320   Small proline-rich protein SPRK (human,
Seq ID No: 52 & 53    422158  L10343     Hs.112341  protease inhibitor 3, skin-derived (SKAL
Seq ID No: 54 & 55    435505  AF200492   Hs.211238  interleukin-1 homolog 1
Seq ID No: 56 & 57    417366  BE185289   Hs.1076    small proline-rich protein 1B (cornifin)
Seq ID No: 58 & 59    431958  X63629     Hs.2877    cadherin 3, type 1, P-cadherin (placenta
Seq ID No: 60 & 61    441020  W79283     Hs^5952    ESTs
Seq ID No: 62 & 63    423217  NM_000094  Hs.1640    collagen, type VII, alpha 1 (epidermolys
Seq ID No: 64 & 65    429538  BE182592   Hs.11261   small proline-rich protein 2A
Seq ID No: 66 & 67    448733  NM_005629  Hs.187958  solute carrier family 6 (neurotransmitte
Seq ID No: 68 & 69    444371  BE540274   Hs.239     forkhead box M1
Seq ID No: 70 & 71    444371  BE540274   Hs.239     forkhead box M1
Seq ID No: 72 & 73    444371  BE540274   Hs.239     forkhead box M1
Seq ID No: 74 & 75    422168  AA586894   Hs.112408  S100 calcium-binding protein A7 (psorias
Seq ID No: 76 & 77    422168  AA586894   Hs.112408  S100 calcium-binding protein A7 (psorias
Seq ID No: 78 & 79    4292S9  AA420450   Hs.292911  Plakophilin
Seq ID No: 80 & 81    A2Sm    BE382756   Hs.169902  solute carrier family 2 (facilitated glu
Seq ID No: 82 & 83    437044  AL035864   Hs.69517   differentially expressed in Fanconi's an
Seq ID No: 84 & 85    423662  AK001035   Hs.130881  B-cell CLL/lymphoma 11A (zinc finger pro
Seq ID No: 86 & 87    428484  AF104032   Hs.184601  solute carrier family 7 (cationic amino
Seq ID No: 88 & 89    429211  AF052693   Hs.198249  gap junction protein, beta 5 (connexin 3
Seq ID No: 90 & 91    417389  BE260964   Hs.82045   midkine (neurite growth-promoting factor
Seq ID No: 92 & 93    423634  AW959S08   Hs.1690    heparin-binding growth factor binding pr
Seq ID No: 94 & 95    417515  L24203     Hs.82237   ataxia-telangiectasia group D-associated
Seq ID No: 96 & 97    441362  BE614410   Hs.23044   RAD51 (S. cerevisiae) homolog (E coli Re
Seq ID No: 98 & 99    425322  U63630     Hs.155637  protein kinase, DNA-activated, catalytic
Seq ID No: 100 & 101  449003  X76342     Hs.389     alcohol dehydrogenase 7 (class IV), mu o
Seq ID No: 102 & 103  431009  BE149762   Hs.48956   gap junction protein, beta 6 (connexin 3
Seq ID No: 104 & 105  409103  AF251237   Hs.112208  XAGE-1 protein
Seq ID No: 106 & 107  417542  J04129     Hs.82269   progestagen-associated endometrial prote
Seq ID No: 108 & 109  428471  X57348     Hs.184510  stratifin
Seq ID No: 110 & 111  418004  U37519     Hs.87539   aldehyde dehydrogenase 3 family, member
Seq ID No: 112 & 113  414761  AU077228   Hs.77256   enhancer of zeste (Drosophila) homolog 2
Seq ID No: 114 & 115  418203  X54942     Hs.83758   CDC28 protein kinase 2
Seq ID No: 116        447343  AA256641   Hs.236894  ESTs, l^simBar b S02392 alpha-Zfli
Seq ID No: 117 & 118  437016  AU076916   Hs.5398    guanine monphosphate synthetase
Seq ID No: 119 & 120  449230  BE613348   Hs.211579  melanoma cell adhesion molecule
Seq ID No: 121 & 122  446989  AK001898   Hs.16740   hypothetical protein FLJ11036
Seq ID No: 123 & 124  457619  AA057484   K5^5406    ESTs, Highly similar to unnamed protein
Seq ID No: 125 & 126  424687  J05070     Hs.151738  matrix metalloproteinase 9 (gelatinase B

SeqtOKa127&128 414430 

Seq!Dfte:l29&130 41W62 

Seq(Orjo:131&132 100668 

SeqIONo:133&134 458933 

S SeqU)No:13S&136 418478 

SeqIDNo:137&138 41847B 

SeqlON'a139&140 418478 

SeqIONo:141&142 418478 

SeqlONo:143&144 446269 

10 S&ilONo:14S&146 422765 

SeqlDNo:147&148 438461 

SeqlDNaUS&ISO 440325 

S8qIDN'o:151&152 439605 

SeqlONo:153&154 453884 

15 SeqlDNo:155&156 453884 

SeqlDNa157&158 453884 

SeqtOKo:159&160 453884 

ScqlDNa161&162 4048n 

SeqlDNo:153&l64 413129 

20 SeqlDNa165&156 413281 

SeqIDNa167&16B 444781 

SeqiONo:169&170 416819 

SeqtDKo:171&172 451320 

SeqtOto173&174 418543 

25 SeqlONa1758 176 454034 

Seq®Na177&178 425397 

S6qU)No:17g&180 415817 

SaqlDNo:181&182 415817 

SeqlDKo:183&184 415817 

30 SeqIDNo:185&186 415817 

SeqlDNo:187&188 415817 

SeqlDNo:189&190 419121 
SeqIDNo:191&192 
SeqlDNo:193&194 
35 SeqlONo:195&196 
Seq !ONo:197&198 
Seq iDNa 199 & 200 
Seq ID No: 201 & 202 
Seq ID No: 203 & 204 
40 Seq iO No: 205 & 208 

Seq ID Ho: 207 & 208 429038 

Seq ID No: 209 & 210 418878 

Seq [D No: 21 1& 212 418678 

Seq (0 No: 2134 214 131927 

45 Seq ID No: 215 & 215 428182 

Seq ID No: 217 & 218 427335 

Seq ID No: 219 & 220 409420 
Seq 10 No: 221 & 222 
Seq ID No: 223 & 224 
50 Seq ID No: 2254 226 

Seq ID No: 227 & 228 415569 

Seq ID No: 229 & 230 103312 

Seq ID No: 231 & 232 320843 

SeqlDNo:233 429065 

55 Seq ID No: 2348.235 446102 

Seq ID No: 236 & 237 330495 

Seq ID No: 238 413573 

Seq ID No: 2394 240 428479 

Seq D No: 241 A 242 428479 

60 SeqlONo:243 &244 332180 

Seq 10 No: 245 437915 

Seq ID No: 2464 247 441553 

Seq 10 No: 248 4 249 331692 

Seq ID No: 250 4 251 429413 

65 Seq ID No: 252 4 253 422283 

Seq ID No: 254 4 255 448357 
Seq ID No: 256 4 257 
Seq ID No: 258 4 259 

Seq ID No: 260 4 261 

70 Seq ID No: 262 4 263 424046 

SeqIDNo:264 4 265 439223 

SeqIDNo:2664267 429228 

Seq ID No: 288 4 269 409757 

Seq ID No: 270 4 271 411089 

75 Seq ID No: 272 4 273 43651 1 

Seq to No: 274 4 275 428969 

Seq ID No; 276 4 277 428969 

Seq 10 No: 278 4 279 428969 

Seq ID No: 280 4 281 428969 

80 Seq ID No: 282 407137 

Seq 10 No: 283 4 284 412723 

Seq ID No: 285 4 286 450701 

Seq ID Na 287 4 288 405770 

Seq1ONa2894290 439453 

85 Seq ID Na 291 4 232 414774 






448993 
421817 
430393 
425057 
420462 
102963 
100576 
101175 



114346 
438956 
404440 



446292 
416209 
453922 



Ai346201 

BE001596 

105424 

AI638429 

U38945 

U38945 

U38945 

U38945 

AW263155 

AW4037Q1 

AA379597 

NM-Q03812 

W79123 

AA355925 

AA355925 

AA355925 

AA35592S 

AF2921Q0 

AAfi61271 

NM_014400 

U77735 

AW118072 

NM>.00S329 

NMJ00691 

J04Qa8 

U88967 

U86967 

U88957 

U88967 

U88967 

AA374372 

AI471630 

AF146074 

8E185030 

AA826434 

AF050147 

X02404 

X00355 

U82671 

AL023513 

NM_001327 

NM_001327 

AJ003112 

B£a86042 

AM48542 

Z15008 

AL137256 

W00847 

NWL005025 
Y12642 



AI7S3247 

AW168067 

U47924 

AI733859 

Y00272 

Y00272 

AF134160 

AI637993 

AA281219 

AI683487 

NM_014058 

AW411307 

N20169 

AF081497 

AA238776 

AF053306 

AFQ27866 

AW23829g 



NM_001898 

AA456454 

AA721252 

AF120274 

AF120274 

AF120274 

AF120274 

T97307 

AA64&459 

H39960 

BE2&4974 
X02419 



Hs.76118 

Hs.85266 

Hs.169610 

Hs24763 

Hs.1174 

Hs.1174 

Ks.1174 

Hs.1174 

Hs.14559 

Hs.1578 

Hsi199 

Hs.7164 

Ksi8561 

H&38232 

Hsi6232 

HS35232 

Hs.36232 

Hs.104613 
Hs^24 
Hs,119S0 
Ks.80205 

Hs.85g62 

Hs^5 

Hs,156346 

Hs.78867 

Hs.78867 

Hs.78867 

Hs.78867 

H&78867 

Hs.89626 

HsJ127 

Hs,108660 

Hsi41305 

Hs.1619 

Hs.97g32 

Hs.274534 

HS.3705B 

HS.3B980 

Hs.194766 

Hs.167379 

Hs.167379 

Hs.34780 

Hs.293317 

Hs.251677 

Hs.54451 

Hs.130489 

HS.13S0S6 

Hs.78589 

Hs.d185 

Hs.34744 

Hs.29643 

Hs.317e94 

Hs.71642 

Hs.149089 

Hs^562 

Hs.334562 

Hs.7327 

Hs.202312 

Hs.121296 

Hs.152213 

Hsi01877 

Hs.114311 

Ks.108923 

Hs.279682 

Hs.79078 

HsJ6708 

Hs.138202 

HS.2S0618 

Hs.326447 

Hs.123114 

Hs.214291 

Hs.291502 

Hs.194689 

Hs.194689 

Hs.194689 

Hs.194589 

Hs.335951 
HS.28B457 

HS.6S66 
Hs.77274 



ubiqiiEn carte39l4eraBia) esterase LI 
kit39nn,beta4 

0)44 aiSgen (bonno fusl^ 
RANbirxfingprataQi 
cjcfii>^ependent lonase 8)^jbllor 2A (ms 
(^c&Hlependent kinase Inhibtfeor 2A (ma 
cyciiKiependeRt kinase oihlbQar 2A (me 
cycEiHiependeflt kinase inhSiaar 2A (R» 
IiypagieOcal protein aJ10540 
bacubvsat lAP cepeakxMtfainiRg 5 (5ur 
HSPC150 proteb sbnSar to uUqui&vcon 
a&irdeorin and metaOopreleinase doma 
6 pratBin«Giiiited receptor 87 
KIAAD18ogen8pradi»t 
K1AA0186 gene product 
KIAAD185 gene product 
KIAA0186 gene product 
N!uL005365i{omo sapiens melanoma aniigen. 
RP42h(»no!og 
transaction factor BMAU 
GR-anchored metastasis-assodated prate 
pim-2 oncogene 

diacy^Iyceroi kinase, zeta (104kD) 
liy^maa synthase 3 
aldehyde deliydrogenasa 3 fatnly. member 
topdsomerase (DNA) 0 alpha (ITOkO) 
proton tyrosine pdusphatasa, ceoeptar4 
protein t/rosine phosphatase, receptor^ 
proton tyrosine phosphatase, receptar4 
protein tyrosine phosphatase, receptDr4 
protein lynsfaie pha^)halase, reoep(or-l 
paiaihyidd homnneJiks honnone 
KIAA0144 gene product 
ATP-binding cassette, sul>-family C (CFTR 
estrogen-fesponsive B t)OX protein 
achaele-scule complex (DrosophBa) homd 
chondromoduCn I precursor 
caldtonin-ralaled polypeptide, beta 
calcitonfnA:a!dtofniHeialed polypeptid 
melanoma anfigen. fansly A. 2 
seizure related gene 6 (mouseHflce 
cancer/testis anfigen (NY-ESO-l) 
cancei^tesSs aniigen (NY-ESO-l) 
doubleoortex; risseooepbaV. X4ihked (d 
ESTs, VteaWy similar to GGCl JtUMAN G ANT 
Ganfigen7B 

iaminin. gamma 2 (nicein (lOOkO). kalini 
ATPase. aminophosphdipid transporter^ 
Humai DNA sequence from done RP5-8S0E9 
NM-021048i1omo sapiens melanoma antigen, 
serine (or cysteine) proteinase inhibito 
lysosomal 

Homo sapiens mRNA: cDNA DKFZp547C136 (fr 
Homo sapiens cDNA FLJ13103 fis. done NT 
ESTs 

guanine nudsotlde binding protein (G pr 
ESTs 

odl diviskai cyde 2. 01 to S and 62 to 
cdl division cyde 2, G1 ts S and G2 to 
dauifin 1 

Homo sapiens done Nil NTeFa201 teratoca 
ESTs 

wingtess-type MMTV Integration site fam 
DESC1 proiein 

COCAS (ceH division cyda 45. Sxerevis 
RAB38, member RAS QAOogene family 

t^ C glycopretBin 
MA02 [rMc airest deScrent yeast h 
budding uninhibited by benzimidazQles 1 
serine (or cysteine) proteinase tnhibitD 
U116 binding protdn2 
ESTs 

cystaSnSN 

ceil diviston cyda Mike 1 (PtTSLRE pr 

ESTs 

artemin 



gb7eS3h05.s1 Soares fetal liver spleea 
hypdheScat protein AF301222 
hypdhefical protein XP_098151 (leudne- 
NM_002382Jtomo sapiens mdanoma aniigen. 
thyroid honnone receptor interador 13 
plasminogen acSvator. urokinasa 
Seq ID No: 293 & 294 424S29

SeqIDNo:295 &2S8 437789 

SeqIDNo:297 &2S3 437789 

Seqa)Ko:299 &3(]0 437789 

SeqlDKo:301&3Q2 437789 

SeqlOKo:303& 304 437789 

SeqIDHo:305&3Qe 45396S 

Seqa)No:307&3C3 403478 

Seq(DNo:309 441525 

SeqlD)to:310&31t 434105 

SeqlONo:312& 313 428810 

SeqtDKo:314& 315 413891 

SeqIDNo:316&317 423934 

SeqIDNo:318& 319 409228 

SeqroNo:320&32l 425734 

SeqlDNo:322& 323 413582 

SeqlONo:324& 323 438403 

SeqlOHo:3%&327 403329 

SeqlONo:328&329 409893 

SeQtDNo:330&331 119073 

Seq[ONo:332& 333 113195 

SeqlDNo:334 & 335 102283 

SeqIONo:33S&337 101345 

Seq1DKo:33B&339 103280 

SeqlONo:340& 341 102012 

SeqlONo:342 & 343 105729 

SeqlONo:344& 345 134299 

SeqiONo:346&347 412719 

SeqIONo:348&349 42215B 

SeqlONo:350 & 351 128924 

SeqlDr{o:352&353 100485 

Seq ID No: 354 4355 419121 

Seq ID No: 356 4 357 409459 

SeqlDNo:358 & 359 330493 

SeqlDNo:36Q&361 417866 

SeqlDNo:362 &363 418113 

SeqlDNo:364&3K 437016 

SeqIDNo:366 &357 429612 

SeqlDNo:368 & 369 440704 

SeqlONo:370&371 431221 

Seq ID No: 3724 373 431565 

Seq ID No: 374 4 375 431565 

Seq ID No: 376 4 377 132354 

Seq ID No: 378 4 379 424441 

Seq 10 No: 380 4 331 103768 

Seq ID No: 382 4 383 417512 

Seq ID No: 384 4 385 425266 

Seq ID No: 386 4 387 424503 

Seq ID No: 388 4 389 400289 

Seq ID No: 390 4 391 418007 

Seq ID No: 392 4 393 418007 

Seq 10 No: 394 4 395 418738 

Seq ID No: 396 4 397 415138 

Seq 10 No: 398 4 399 418506 

Seq ID No: 400 4 401 423961 

Seq ID No: 402 4 403 414812 

Seq ID No: 404 4 405 417433 

Seq ID No: 406 4 407 417433 

Seq ID No: 408 4 409 422S67 

Seq ID No: 410 4 411 428227 

Seq ID No: 412 4 413 444381 

Seq ID No: 414 4 415 400303 

Seq ID No: 416 4 417 411789 

Seq ID No: 418 4 419 428698 

Seq ID No: 420 4 421 -450098 

Seq ID No: 422 4 423 421552 

Seq ID No: 424 4 425 452747 

Seq ID No: 426 4 427 450375 

Seq 10 No: 428 4 429 426215 

Seq ID No: 430 4 431 425247 

Seq ID No: 432 4 433 432201 

Seq ID No: 434 4 435 427585 

Ssq ID No: 436 4 437 442117 

Seq ID No: 438 4 439 431211 

Seq ID No: 440 4 441 447033 

Seq ID No: 442 4 443 447033 

Seq 10 No: 444 4 445 447033 

Seq ID No: 446 4 447 115522 

Seq 10 No: 448 4 449 410418 

Seq 10 No: 450 4 451 409041 

Seq ID No: 452 4 453 409041 

Seq 10 No: 454 4 455 452461 

Seq ID No: 456 4 457 412420 

Seq ID No: 458 4 459 416658 

Seq 10 Na 460 4 461 407811 





Ks.151^ 


gMamat&cystORe Bgase. csyyfic sub 


AIS81344 


Hs.127812 


ESTs, Weaxly amuar to T17330 oypotneo 


A1581344 


Hs.127812 


ESTs, Wesloy sonSar to 117330 hypoibeS 


AI581344 


Hs.127812 


ESTs, Weasy sonlar to T17330 hypoineo 


A1K1344 


te.127812 


ESTs. WeaSdy siiniar to T17330 hypdheS 


A1581344 


Hs.127812 


ESTs, WeaUy sindar to T1 7330 hypoM 


AA847843 


HSJ62711 


Rtot^ group (nootttstooe tioQinQSO 




NMJl22342Kamo sapans kinesui protain 9 


AW241867 


Hs.127728 


ESTs 


AW952124 


Ks.13094 


presefulms associaled rhondxw^fite pro 


AFQSS23& 


Hs.193788 


fitric oxide synSiase 2A CiRducible, hep 


ABQ23173 


HSw75478 


ATPase.aassVl, type lis 


U89995 


Hs.159234 


foildiedd box E1 (thynsd banscnpSon f 


R16811 


HsJ22010 


ESTSt Weaoy sonuar to 21Q92ouA B oeO 


AP056209 


Hs.1 59396 


pepSciyfglycina alpba-amUating mooooxyg 


AW295647 


Hs,71331 


hypdhefical pnddn MGC5350 


M8066O7 


Hs.292205 


ESTs 




unnamed (KOtetn pnxlud (Homo safsens) 


AW247090 


Hs.57101 


piinidtfornosofne matitsnance deSctent (S. 


eE245360 


Hs^79477 


i^«ts eiythroblastosts viius E26 oncogen 


H83265 


HS.8S81 


ESTs, ^inlar to S41044 duofnosom 


AW161552 


Hs.83381 


guanine nudeoGde binding protdn 11 


NWL005795 


HS.1S2175 


cakstonin teceptor-fike 


U84722 


Hs.76206 


cadt^ 5. ^ 2. VH-cadhenn (vasada 


BE259035 


Hs.1 18400 


snged (DF0soph3a}-Ito (sea urcl^ 


H46612 


HS.2S3615 


Honn sapiens HSPC285 mRNA, panial 00$ 


AW5B0939 


Hs37199 


ootnplanent camponent Clq leceptor 


AW016610 


Hs.816 


ESTs 


U0343 


Hs.1 12341 


protease inh&ttor3, sHiHieftved (SKAL 


BE279383 


H$.26S57 


p)akophi&n3 


T19006 


Hs.10842 


RAN, member RAS Qocogene taniQy 


AA374372 


Hs.69626 


pamBiyrold bomion»fi!ce homme 


D86407 


Hs,54481 


low density fipoprotsin reoeptor-relatod 


M27826 




endogenous retroviral protease 


AW067903 


Ks£2772 


collagen, type XI. a!pha1 


AI272141 


H&83484 


SRY (sex detemtinlng region Y)-hox 4 


AU076916 


Hs.5398 


guanine monphosphate synthetase 


AF062649 


HS.2S2587 


l»&j{taiy tumor-translbnning 1 


M&9241 


Hs.162 


insu6n4ike growth factor binding prate 


AA449015 


HS266145 


SRB7 (suppressor of RNA polymerase B, ya 


AF161470 


HS.260S22 


butyrate^nduced b^nsoipl 1 


AF161470 


Hs.260622 


butyrato-tnduced transcript 1 


6E185289 


Hs.1076 


small praHneHich protein IB (oomifin) 


X14850 


Hs.1470g7 


H2A histone fiamily, member X 


AF086009 


HS296398 


gb:Homo sapens filU lengUi insert cDNA 


X76534 


Hs.82226 


glycoprotein (transmembrane) nmb 


J00077 


HS.15S421 


alpha^topFotein 


NM.0022Q5 


H5.149609 


Intsgrin. alpha 5 (fibronedin receptor. 


X07820 


Hs.2258 


matrix metaOaprotelnase 10 (stromelysin 


M13^ 


Hs.83169 


nofaix metalbproteinase 1 (IntarstiSd 


Ml 3509 


Hs.83169 


matrix metalloproteinase 1 (InterstiM 


AW388533 


Hs.6682' 


sdute carrier family 7, (caBonic amino 


C18356 


Hs.295944 


tissue factor pathway inhibitor 2 


AA084248 


Hs.85339 


G protein-coupled receptor 39 


01 3666 


Hs.1 36348 


periGstin (0$F-2os} 


X72755 


Hs.77357 


monoldne Induced by gamma interferon 


BE27Q266 


Hs.82128 


5T4oncofiBtaItrophob{astg}ycopiOlein > 


BE270266 


Hs.82128 


5T4 oncofetal trophobiasl glycoprotein 


L32137 


Hs.1 584 


carfSage oi^omeric matrix protein (pse 


AA321649 


Hs.2248 


sm^ inducttiie cytoldne subtemHy 8 (Cy 


BE387335 


Hs.283713 


ESTs, Weakly similar to S64054 hypotheti 


AA24275B 


H5.79136 


UV-I protein, estrogen regdaled 


AF245505 


Hs.72157 


AdOcan 


AA852773 


Hs.334838 


K1AA1866 protein 


W27249 


Hs.8109 


hypothetical protein FU21060 


AF026692 


H5.105700 


secreted frizzled-relaled protein 4 


BE1538S5 


HS.614S0 


Ig superfandy receptor LNIR 


AA009547 




a distntegrin and metallopnteinase doma 


AW963419 


Hs.155223 


stsnntocaicin 2 


NM.005940 


Hs.155324 


matrix metaltoprot^ase 11 (stromelysin 


A1S38613 


H5.298241 


Transmembrane protease, serine 3 


031152 


H&179729 


cdlagen. type X. alpha 1 (Schmid metaph 


AW864964 


Hs.128899 


ESTs; tvpoihefical protein lor IMAGE:447 


M86849 


Hs.323733 


gap junction protein, beta 2. 26kO (conn 


AI357412 


Hs.157601 


ESTs 


A1357412 


Hs.157601 


ESTs 


Al3o7412 


ilm. 4CTCA1 

HS<1d/bUi 


CCTe 

cots 


BC614387 


Hs.333893 


c4^yc target JP01 


D31382 


Hs.63325 


transmembrane protease, serine 4 


AB033025 


Hs.sooai 


Hypothetical protein. XP^051860 0<WA119 


AB033025 


Hs.50081 


Hypothetical protein, XP_{K1860 0<IAA119 


N78223 


Hs.108106 


transcrfption factor 


A1035668 


Hs.73853 


bone morphogenetic protein 2 


U03272 


Hs.79432 


libdifin 2 (congenital contractual ara 


AW190902 


Hs.40098 


cysteine knot superfamily 1. BUP antagon 




SeqiONa452&4o3 437852 

SeqtONo:454 & 455 402075 

SeqIOKa466&467 421110 

SeqtOKa:48S&469 451668 

Seq ID to 470 a 471 451568 

5eqlOKo:472 &473 451S68 

SeqIDNa474 &475 «2282 

S9qlDNa476 A 477 425852 

Seq[DKa478& 479 439733 

SeqlDtto:480&48t 427747 

S6q{OKo:482 &4a3 42Q231 

SdqtDKo:484 & 485 405932 

SeqIONo:485 & 487 405932 

Seq[DKo:488&489 444342 

SeqiDto^&491 421379 

Seq(ONo:492 &4S3 417079 

SeqrDNa494& 495 430S90 

Seq ID No: 496 4497 419721 

SeqlONa4S3&499 444471 

SeqlDNaSOO&SOl 4130S3 

SeqlDNo:502& 503 4338W 

Seq(ONo:SQ4&S05 452401 

Seq!DNo:ai5& 507 452401 

SeqIDNo:508 & 509 450001 

SeqlDMo:510& 511 410407 

SeqIDNaS12&S13 309931 

SeqiONo:514& 515 412719 

SeqIONo:516& 517 417034 

SeqlDNo:5]8&5ig 430488 

SeqIDNo:520 &521 413753 

SeqtDNo:S22&523 425650 

SeqlDNa524&525 423873 

SeqlDNo:526 &527 418663 

SeqlDNo:526 &529 418863 

SeqlDNa530&S31 429610 

SeqlDNa532 & 533 405690 

SeqlONo:534 &535 431646 

SeqlONo:536&537 422158 

SeqlDNo:538&S39 431958 

Seq 10 No: 540 8. 541 437044 

Seq ID No: 542 4 543 428484 

SeqlDNo:544&545 429211 

Seq ID No: 546 8 547 417389 

Seq ID No: 548 8 549 431009 

Seq ID No: 550 8 551 417542 

Seq ID No: 552 8 553 449230 

Seq ID No: 554 8555 410555 

Seq ID No: 556 8 557 410555 

SeqtDhto:558 & 559 424687 

SeqlONo:560&S€1 418462 

Seq to No: 562 8563 410274 

SeqtDNo:S64&565 439606 

Seq ID No: 5668 567 404877 

SeqlDNo:S68&S69 444781 

Seq 10 No: 570 8 571 418543 

Seqa)Ko:S72&573 415817 

Seq H) No: 574 8875 415817 

Seq ID No: 576 8577 415817 

Seq ID No: 578 8 579 415817 

Seq (D No: 580 8 581 415817 

Seq 10 No: 582 8583 415817 

Seq ID No: 584 8 585 421817 

Seq 10 No: SBS 8 587 418578 

Seq (0 No: 588 8 589 418678 

Seq ID No: 590 8 591 409420 

Seq ID No: 592 8593 332180 

Seq ID No: 594 8 595 408790 

Seq ID No: 5968 597 408790 

Seq ID No: 5988 SS9 439223 

Seq ID Na 6008 601 409757 

Seq 10 No: 602 8 603 428969 

Seq ID No: 604 8 605 426969 

SeqlONa6068607 428969 

Seq 10 No: 608 8 609 428989 

Seq to No: 610 8 611 450701 

Seq ID No: 612 8 613 450701 

Seq ID No: 814 8 615 414774 

Seq ID Na 616 8 617 407944 

Seq ID No: 618 8 619 407944 

Seq ID No: 620 8 621 457489 

SeqIDNo:622 8 623 429547 

Seq©No:6248625 407242 

Seq(DNo:628 & 627 407242 

Seqa)No:628&629 407242 

Seq ID Ko: 630 8 631 4440(» 





Hs.256897 


Aj250717 


^te.13S5 


Z43948 


HS326444 


Z43948 


H&a26444 


243948 


HsJ28444 


AF019225 


Hs.114309 


AKD01504 


Hs.159651 


SE246502 


H&9598 


AW411425 


H&16Q655 


fiSSSSBSZ 


H&323494 


NM 0143S8 


Ks.10887 


Y1S221 


K&103982 


US5590 


Ks.81134 


X54232 


Hs.2699 


Mu 001650 


Hs.288650 


A^32Q684 


Hs.11217 




Hs,75184 


AI034361 


Hs,135150 


NM_007115 


HS.2S352 


NM_007115 


Hs3352 


NM-001044 


Hs.406 


X65839 


Hs.63287 


AW341683 




AW016610 


Hs^16 


MM 008183 


'Hs.80962 




Hs^41S51 


U17760 


HsJ5517 


Mil nni^ 


Hs.19^ 


ftP003054 


Hs.1695 


AKfmiQO 


Hs.41690 


/UVUUI IW 






Hs^11092 


M29540 


Hs.220529 




Hs 271580 


LI 0343 


Ks. 11 2341 


X53629 


Hs.2877 


Al 035884 


Hs,69517 


AF1 04032 






Hs.198249 


BE260964 


Hs.82045 


BE149762 


Hs.48956 


J04129 


HsJ2289 


6^13348 

DCWI<N*fW 


HsJ11579 


U92649 


HSJ64311 


U92649 


H5.64311 


J05O70 


Hs.151738 




Hsi5266 


AA381807 


Hs£1762 


WJ9123 


H&S8561 


NM_014400 


H&11950 


NfuL005329 


Ks.85962 


U88987 


H5.7BB67 


1188967 


Hs.78867 


U88957 


Hs.78867 


U88967 


Hs.78867 


U88967 


Hs.78867 


Moaqg? 

\JOU9U( 


Hs.78867 


Mr •*»Ovi*> 








MM nM'iT? 
NM_wl Otf 






Hs.54451 




Hs.7327 














MM flOISSA 


Hs 193114 


APi 50574 


R<: 194889 




Hs.194689 


AF1 20274 


Ks.194669 


AF120274 


Hs.194689 


H39960 


Hs.288467 


H39960 


Ks.288467 


X02419 


Hs,77274 


R34008 


Hs.239727 


R34008 


Hs.239727 


A^IS 


Hs.127179 


AW009166 


Hs^6 


M18728 




|yt18728 




M18728 






HS.10Q86 



ESTs.V\i'baklysimaartodl3G50111 Pisa 
EN^%0000251056^Plasnia menibrane cainnn 
cafliep^E 

caffDaseadcScpnrieml 
cariilage addc protein 1 
apoCpopreianl 

de^ receptor 5, TNF superfamay msniber 
sema tiomxi, immunogUsdin (tonain {l9}> 
serineAhreonine kinase 12 
Ptedidedcatioo efOux pump 
C150flO305:gil3806122feblAAC6919ai| {AFO 
C150003(S:gil380612^lAAC©19ai| (AFO 
sonilar to lysosom&^ssodatsd membrane 
smafl indudtite cytotdne subfanrnTy B (Or 
mterieukin 1 receptor antagonist 
glyin^ 1 
aquapQnn4 
KIAAQS77 protein 

chitinase 3-like 1 (cartilage glycoprote
lung type-I cell membrane-associated gly
tumor necrosis factor, alpha-induced pro
tumor necrosis factor, alpha-induced pro
solute carrier family 6 (neurotransmitte
carbonic anhydrase IX

gbJidl3d0lJtl Soa8SLNFk.TJ3BC_S1 Homos
ESTs

neurotensin

chloride channel, calcium activated, fam
laminin, beta 3 (nicein (125kD), kalinin
desmoglein 3 (pemphigus vulgaris antigen
matrix metalloproteinase 12 (macrophage
desmocollin 3
desmocollin 3

LUNX protein; PLUNC (palate lung and nas
carcinoembryonic antigen-related cell ad

uroplakin 1B

protease inhibitor 3, skin-derived (SKAL
cadherin 3, type 1, P-cadherin (placenta
differentially expressed in Fanconi's an
solute carrier family 7 (cationic amino
gap junction protein, beta 5 (connexin 3
midkine (neurite growth-promoting factor
gap junction protein, beta 6 (connexin 3
progestagen-associated endometrial prote
melanoma cell adhesion molecule
a disintegrin and metalloproteinase doma
a disintegrin and metalloproteinase doma
matrix metalloproteinase 9 (gelatinase B
integrin, beta 4
hypoxia-inducible protein 2
G protein-coupled receptor 87
NM_005365:Homo sapiens melanoma antigen,
GPI-anchored metastasis-associated prote
hyaluronan synthase 3
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
ATP-binding cassette, sub-family C (CFTR
cancer/testis antigen (NY-ESO-1)
cancer/testis antigen (NY-ESO-1)
laminin, gamma 2 (nicein (100kD), kalini
claudin 1

neurotrophic tyrosine kinase, receptor,

neurotrophic tyrosine kinase, receptor,

UL16 binding protein 2

cystatin SN

artemin

artemin

artemin

artemin

hypothetical protein XP_098151 (leucine-

hypothetical protein XP_098151 (leucine-

plasminogen activator, urokinase

desmocollin 2

desmocollin 2

cryptic gene

ESTs

gb:Human nonspecific crossreacting antig
gb:Human nonspecific crossreacting antig
gb:Human nonspecific crossreacting antig
type I transmembrane protein Fn14
Seq ID No: 632 & 633
Seq ID No: 634 & 635
Seq ID No: 636 & 637
Seq ID No: 638 & 639
Seq ID No: 640 & 641
Seq ID No: 642 & 643
Seq ID No: 644 & 645
Seq ID No: 646 & 647
Seq ID No: 648 & 649
Seq ID No: 650 & 651
Seq ID No: 652 & 653
Seq ID No: 654 & 655
Seq ID No: 656 & 657
Seq ID No: 658 & 659
Seq ID No: 660 & 661
Seq ID No: 662 & 663
Seq ID No: 664 & 665
Seq ID No: 666 & 667
Seq ID No: 668 & 669
Seq ID No: 670 & 671
Seq ID No: 672 & 673
Seq ID No: 674 & 675
Seq ID No: 676 & 677
Seq ID No: 678 & 679
Seq ID No: 680 & 681
Seq ID No: 682 & 683
Seq ID No: 684 & 685
Seq ID No: 686 & 687
Seq ID No: 688 & 689

TABLE 15B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers






429597 


NKLO03316 




aifeintegrmapdinelafriproiftgasedcgna 


4221(S 


S73265 


Hs.1473 


gastriiHBteasL'ig pCffiJe 


419235 


AW470411 


Hs.283433 


nsurobui'm 


449048 


Z45051 


Hs.??9?0 


simiar to S68401 (cstfla) Sbnosd bulls 


413216 


AU076718 


Hs.164021 


snaD indodlde qrtiddne subCand^ B (Cy 


431462 


AW583872 


KS.2S6311 


granii^lte aeuraendocite pepSds precu 


448243 


AW359771 


HS52620 


inte9nn,b^8 


426427 


M866S 


Hs.169340 


TTK protein idnsse 


445537 


AJ245671 


Ks.12844 


EGF-&e-do(nan, nudtipte 6 


422273 


AF072873 


Hs.11^8 


frizzled (Drosopttila^ homofog 6 


428450 


N!UL014791 


Hs.184339 


K1AA0175 gene product 


446619 


AU076&43 


H&313 


secreted phosphioprotan 1 (osteoponfin. 


453392 


U23752 


Hs,329S4 


SRY (sexdetemxning region YVbia 11 


426514 


BE616633 


Hs.170195 


bone morphogeneSc protdn 7 (osteogenic 


425775 


U25128 


Hs.1594Sg 


paralhyroid hsnnone receptor 2 


425776 


U2512B 


Hs.159499 


paraaQrroid honnone receptor 2 


431515 


NM_012152 


Hs.258583 


endotheBal dlsrenfialion, tysQphospha 


419452 


U33635 


Hs^2 


PTK7 protein tyreslne idnase 7 


432653 


N62(a6 


Hs.293185 


ESTs, Weakly siaSar to JC7328 amino ad 


432653 


mo5Q 


Hs^ISS 


ESTs, Weakly sinalar to JC7328 amino ad 


432653 


N62096 


Hs^l85 


ESTs, Weakly similar to JC7328 amino ad 


432653 


N62096 


Hs^185 


ESTs, Weakly similar to JC732B amino ad 


410001 


AB04103& 


H5S7771 


kaODffeinll 


42SS01 


AW0437S2 


Hs.2d3816 


ESTs 


408369 


R38438 


HS.18257S 


solute carnerfamiiy 15 (H7?? transport 


445413 


AA15m2 


Hs.12677 


CGI-147 protein 


422424 


AJ186431 


Hs^5638 


prostate di&rentiafion tector 


428330 


L22S24 


Hsj22SS 


matrix mdafloprotdnase 7 (matnlysin. 


420610 


A1683183 


HSJ9348 


di$tal4ess homeo box 5 



Pkey 
309931 
330493 
439285 



CAT Number 
AW34t683 
33264J 
47055J 



450375 83327.1 
451320 86S76J 



Accession 

M27826 R7841 6 AA307645 AW957879 AW957800 AA633S29 H03662 

AL133916N79113AF086101 N76721 AW950828AA364013 AW95S684AI346341 A1B674S4 N54784 AI655270 A1421279AW014B82 
AWl 18072 AI631982 T1 5734 AA224195 A1701458 W20198 Fffi326 AAS90570 N90552 AW071907 A1671352 A137S892 T03517 RB826S 
AI124088 AA224388 AI084316 AI354686 T33652 A1140719 AI72021 1 T03490 AI372837 Tl 5415 AW205836 AA630384 T03515 T33230 
AA017131 AA443303 T33623AI2225S6T33511 T33785A(419606 055612 



TABLE 15C 

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125572-126076

403329 8516120 Plus 96450-96598

403478 9958258 Plus 116458-116564

404440 7528051 Plus 80430-81581

404877 1519284 Plus 1095-2107

405770 2735037 Plus 61057-62075

405932 7767812 Minus 123525-123713
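
The Nt_position column of Table 15C lists predicted exons as comma-separated start-end ranges. As an illustration only, the following minimal Python sketch shows one way such a field could be split into coordinate pairs and summed; the function names `parse_nt_position` and `total_exon_length` are hypothetical and are not part of the disclosure, and the coordinates are assumed to be 1-based and inclusive.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Split a comma-separated list of 'start-end' ranges into integer pairs."""
    exons = []
    for part in field.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma or stray whitespace
        start_text, end_text = part.split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"range runs backwards: {part}")
        exons.append((start, end))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in exons)


# Row for Pkey 403329 in Table 15C: a single predicted exon.
exons = parse_nt_position("96450-96598")
print(exons, total_exon_length(exons))
```

For a Minus strand entry the ranges are still listed in ascending order in the table, so the same parsing applies; only the interpretation of the strand changes.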
Table 16 



Seq ID NO: 1 DNA sequence

Nucleic Acid Accession #: NM_001216

Coding sequence: 43..1422

1 11 21 31 41 51

| | | | | |

GCCOGTACAC ACOGTGTGCT GGGACACCCC ACaCTCAGCC GCATQGCTCC CCTGTGCCOC €0 

AGCCCCTGGC TCCCTCTGTT GATOCOGGCC CCTGCTCCAG 6CCTCACTGT GCAACTGCTG 120 

CTGTCACTCC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCCGGAT GCAGGAGGAT 180 

TCCCCCTTGG GAGGAGGCTC TTCTGGGGAA GATGACCCAC TGGGCGAGGA GGATCTGCCC 240 

ASK3AAGAGG ATTCAOCCAG ACAOGAGGAT GCAOCGGGAG AOGAGGATCT ACCTGGAGAG 300 

GAGGATCTAC CTCGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATGAGA AGAAGAGGGC 360 

TCCCTGAAGT TAGAGGATCT AOCTACTGTT GAGGCTCCPG GAGATC CTCA AGAACCCCAG 420 

AATAATGCXX ACAGGGACAA AGAAOGGGAT GACCAGAGTC ATTGGOGCTA TGGAGGOGAC 480 

(XGCCCTGGC CCOGGGTGTC CCCAGCCTGC GOGGGCCGCT TOCAGTCOCC GGTGGATATC S40 

QGCCCCCAGC TOGCCGCCTT CTGCOOGGCC CTGCGCCCCC TCGAACTCCT GGGCTTCCAG 600 

CTCCCGCCGC TCCCAGAACT GOGCCTGCGC AACAATGGCC ACAGTGTGCA ACTGACCCTG 660 

CCTCCTGGGC TAGAGATGGC TCTGGGTCCC GG60GGGAGT ACOQGGCTCT GCAGCTGCAT 720 

CTGCACTGGG GGGCTGCAGG TCGTOOGGGC TOGGAGCACA CTGTO GAAG G OCACOGTTTC 780 

CCT6CCGAGA TOCACGTGGT TCACCTCAGC ACCGCCTTTG CCA6AGTT6A OGAGGCCTTG 840 

GGGCGCCOGG GAGGCCTGGC CGTGTTGGCC GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900 

AGTGCCTATG AGCAGTTGCT GTCTOGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAOACT 960 

CAGGTCCCAG GACTGGACAT ATCTGCACTC CTGCCCTCTG ACTTCAGOCG CTACTTCCAA 1020 

TATGAGGGGT CTCTGACTAC AO06GCCTGT G0CGAG6GT6 TCATCTGGAC TGTGTTTAAC 1080 

CAGACAGTCA TGCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGGGGACCT 1140 

GGTGACTCTC GGCTACAGCT QAACTTCOGA GOGACGCAGC CTTTGAATGG GCGAGTGATT 1200 

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTCGGG CTGCTGAGCC AGTCCAGCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATCCTA GCCCTGGTTT TTGGCCTCCT TTTTGCTGTC 1320 

ACCAGCGTCG CGTTCCTTGT GCAGATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380 

GTGAGCTACC GCCCAGCAGA GGTAGCCGAG ACTGGAGCCT AGAGGCTGGA TCTTGGAGAA 1440 

TGTGAGAAGC CAGCCAGAGG CATCTGAGGG GGAGOOGGTA ACTGTOCTGT CCTGCTCATT ISOO 
ATGCCACTTC CTTTTAACrG CCAAGAAATT TTTTAAAATA AATATTTATA AT 

Seq ID NO: 2 Protein sequence: 
Protein Accession #: NP_001207

1 11 21 31 41 51

| | | | | |

MAPLCPSPHIi PLLIPAPAPG tiTVQLLLSLL LLMPVHPQRL PRMQEDSPLG GGSSGEEDPL 60 

GEEDLPSEED SPREEDPPGE EDLPGEEDLP GEEDLPEVKP KSEEEGSLKL EDLPTVEAPG 120 

DPQEPQNNAH RDKEGDDQSH WRYGGDPPWP RVSPACAGRF QSPVDIRPQL AAFCPALRPL 180 

ELLGFQLPPIj PELRLRNNGH SVQLTLPPGL ETIALGPGREY RALQLHUIWG AAGRPGSEHT 240 

VEQQtFPAEI HWHLSTAFA RVDEALGSPG GLAVLAAFLE EGPEENSAYE QLLSRLEEIA 300 

EEG5ETQVPG LDISALLPSD FSRYFQYEGS IiTTPPCAQGV IWTVFNQTVM LSARQLHTLS 360 

DTZMGFGDSR LQLNFRATQP LNGRVIGASF PAGVDSSPRA ABFVQUTSCL AAGDILALVF 420 
CLLFAVTSVA FLVQMRRQHR RGTKGGVSYR PAEVAETGA 
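
Each DNA entry in Table 16 states a coding sequence range, and the paired protein entry lists the translated residues. As a consistency check, and assuming (this is an assumption, not stated in the table) that the range is 1-based, inclusive, and ends with the stop codon, the expected protein length can be computed from the range alone. The helper name `expected_protein_length` is hypothetical.

```python
def expected_protein_length(cds_start: int, cds_end: int,
                            includes_stop: bool = True) -> int:
    """Return the number of residues implied by a 1-based inclusive CDS range."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    codons = length // 3
    # Assumption: the final codon is the stop codon and encodes no residue.
    return codons - 1 if includes_stop else codons


# Seq ID NO: 1 lists a coding sequence of 43..1422.
print(expected_protein_length(43, 1422))
```

Under that assumption, the range 43..1422 implies 459 residues, which agrees with the 459 residues listed for Seq ID NO: 2. A mismatch for another entry would suggest either a transcription error in the listing or a range that excludes the stop codon.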

Seq ID NO: 3 DNA sequence

Nucleic Acid Accession #: BC013923

Coding sequence: 438-1391

1 11 21 31 41 51

| | | | | |

AGCGGGGTTG TCTATTAACT TGTTCAAAAA GTATCAGGAG TTGTCAAGGC AGAGAAGAGA 60 

GTGTTTGCAA AAGGGGGAAA GTAGTTTGCT GCCTCTTTAA GACTAGGACT GAGAGAAAGA 120 

AGAGGAGAGA GAAAGAAAGG GAGAGAAGTT TGAGCCCCAG GCTTAAGCCT TTCCAAAAAA 180 

TAATAATAAC AATCATOGGC GGOGGCAGGA TCGGCCAGAG GAGGAGGGAA GCGCTTTTTT 240 

TGATCCTGAT TCCAGTTTGC CTCTCTCTTT TTTTCCCCCA AATTATTCTT OGCCTGATTT 300 

TCCTCGCGGA GCCCTGCGCT CCCQACACCC COGCCCGCCT CCCCTCCTCC TCTCCCCC06 360 

CCCGOSGGCC CCCCAAAGTC CCGGCCX3GGC CGAGGGTCGG CGGCCGCCX5G CGGGCCGGGC 420 

COGOGCACAG CGCCCGCATG TACAACATGA TGGAGACGGA GCTGAAGCOG CCGGGCCCGC 480 

AGCAAACTTC GGGGGGOGGC GGCGGCAACT CCACOGCGGC GGCGGCCGGC GGCAACCAGA 540 

AAAACAGCCC GGACCGCGTC AAGCGGCCCA TGAATGCCTT CATGGTGTGG TCCOGOGGGC 600 

AGOGGCGCAA GATG6CCCAG GAGAACCCCA AGATGCACAA CTOGGAGATC AGCAAGCGCC 660 

TGGGCGCCGA GTGGAAACTT TTGTCGGAGA CGGAGAAGCG GCCGTTCATC GAOGAGGCTA 720 

AGOGGCTGCG AGCGCTGCAC ATGAAGGAGC ACCOaGATTA TAAATACOGG CCCCGGCGGA 780 

AAACCAAGAC GCTCATGAAG AAGGATAAGT ACAOGCTGCC CGGCGGGCTG CTGGCCCCCG 840 

GOGGCAATAG CATGGCGAGC GGGGTCGGGG TGGGCGCCGG CCTGGGCGOG GGCGTGAACC 900 

AGOSCATGGA CAGTTAOGCG CACATGAACG GCTGGAGCAA CGGCAGCTAC AGCATGATGC 960 

AGGACCAGCT GGGCTACC06 CAGCACCCGG GCCTCAATGC GCACGGCGCA GCGCAGATGC 1020 

AGGCCATGCA CCGCTAGGAC GTGAGCGCCC TGCA0TACAA CTCCATGACC AGCTOGCAGA 1080 

OCTACATGAA CGGCTOGCCC ACCTACAGCA TGTCCTACTC GCAGCAGGGC ACCCCTGGCA 1140 

TGGCTCTTGG CTCCATGGGT TCGGTGGTCA AGTCOGAGGC CAGCTCCAGC CCCCCTGTGG 1200 

TTACCTCTTC CTCCCACTCC AGGGCGCCCT GCCAGGCXXX; GGACCTCCGG GACATGATCA 1260 

GCATGTATCT CCCCGGOGCC GAGGTGCCGG AACCCGCCGC CCCCAGCAGA CTTCACATGT 1320 

CCCAGCACTA CCAGAGGGGC OCGGTGCCGG GCAOGGCCAT T AAOGG CACA CTGOCCCTCT 1380 

CACACATGT6 A6GGC0GGAC AGCGAACTGG AGGGG06AGA AATTTTCAAA GAAAAAOGA6 1440 

GGAAATGGGA GGGGTGCAAA AGAGGAGAGT AAGAAACAGC ATGGAGAAAA C0CG6TA0GC 1500 

TCAAAAAAAA AAAAAAAAAA AAAATCCCAT CACCCACAGC AAATGACAGC TGCAAAAGAG 1560 

AACACCAATC CCATCCACAC TCACGCAAAA ACCGCGATGC CGACAAGAAA ACTTTTATGA 1620 

GAGAGATCCT GGACTTCTTT TKGGGGGACT ATTTTTGTAC AGAGAAAACC TGGGGAGGGT 1680 

GGGGAGGGCG GGGGAATGGA CCTTGTATAG ATCTGGAGGA AAGAAAGCTA CGAAAAACTT 1740 

TTTAAAAGTT CTAGTGGTAC GGTACGAGCT TTGCAGGAA6 TTTGCAAAAG TCTTTACCAA 1800 

TAATATTTAG ACCTAGTCTC CAAGOBAOGA AAAAAATGTT TTAATATTTG CAAGCAACTT 1860 

TT6TACAGTA TTTATOGAGA TAAACATGGC AATCAAAATC TCCATTGTTT ATAAGCTGAG 1920 
AATTTGCCAA TATTTTTCAA GGAGAGGCTT CTTCCTE3AT TTTGATTCTG CAGC TCSftAAT 1980 

TTAGGACAGT TGCAAACGTG AAAAGAAGAA AATTATTCAA ATTTGGACAT rrTAATTGTT 2040 

TAAAAATTGT ACAAAAGGAA AAAATTAGAA TAAGTACTGG OSAACXIATCT CTGTGGTCTT 2100 

GTTTAAAAAG GGCAAAAGTT TTAGACIG?A CTAAATTTTA TAACTTACTG TTAAAAGCAA 2160 

AAATCGCCAT GCAGGTTGAC AGCXSTTGCTA ArPTATAATA GCTTrrGTTC GATCCCAACT 2220 

TTCCATTTTG TTCAGATAAA AAAAACCATG AAATTACTGT GTTTGAAATA TTT TCTTATG 2230 

GrmrrAATA tttctgtaaa ttcattgtga tattttaagg ttttcccccc tttattttcc 2340 

GTAGTTGTAT TTTAAAAGAT TOGGCTCTGT ATTATTTGAA TCAGTCTGCC GAGAATC CAT 2400 

GTATATATTT GAACTAATAT CATCCTTATA ACAGGTACAT TTTCAACTTA AGTTTTTACT 2460 

OCATTATCCA CAGTTTGAGA TAAATAAATT TTTGAAATAT GGACACTGAA AAAAAAAAAA 2520 

AAAAAAACAA AACAAAAAAA CAAAAAACAA AAACAGAAAA AAC3UVAAAAA AAAACAAAAC 2580 

CACAACACAA AAACAAAAAA AAAAAAAAGA AACAAACACA CAACACAACA CAACACAAAA 2640 
CCACAACACA AACAACAACA CACAGAGGG 

Seq ID NO: 4 Protein sequence:
Protein Accession #: CAA83435.1

1 11 21 31 41 51 

MYNKMETEIiK PPGPQQTSGG GGGNSTAAAA GGNQKNSPDR VKRPMNAFMV WSBGQHRKMA 60 

QENPKMHNSB ISKRIiGAEWK LLSBTBKRPP IDEAKRLRAL HMKEHPDYKY RPRWCTRrm 120 

KKDiOTLPGG LLAPGCaiSMA SGVGVGAGLG AGVNQRMDSY AHKN6HSKGS YSMKmLGY 180 

PQHPGUiAEG AAQMQPMHRY DVSALQYNSM TSSQTV«NGS PTYSMSYSQQ GTWSMAWSM 240 

QSWKSEASS SPPWTSSSH SKAPCQACa)!. HDMISHVLPG ABVPBPAAPS RLHMSQHYQS 300 
GPVPGTAING TLPLSHM 

Seq ID NO: 5 DNA sequence 
Nucleic Acid Accession #: U91618
Coding sequence: 29-541 

1 11 21 31 41 51 

OGGACTTGGC TTGTTAGAAG GCTGAAAGAT GATGGCAGGA ATGAAAATCC AGCTT6TATG 60 

CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCTGTGCTCA GATTCAGAAG AGGAAATGAA 120 

AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 180 

TCCCTCTTGG AAGATGACTC TGCTAAATGT TTGCAGTCTT GTAAATAATT TGAACAGCOC 240 

AGCTGAGGAA ACAGGAGAAG TTCATGAAGA GGAGCTTGTT GCAAGAAGGA AACTTCCTAC 300 

TGCTTTAGAT GGCTTTAGCT TGGAAGCAAT GTTGACAATA TACCAGCTCC ACAAAATCTG 360 

TCACAGCAGG GCTTTTCAAC ACTGGGAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 420 

TGACAAAAAT GGAAAGGAAG AAGTCATAAA GAGAAAAATT CCTTATATTC TGAAACGGCA 480 

GCTGTATGAG AATAAACCCA GAAGACCCTA CATACTCAAA AGAGATTCTT ACTATTACTG 540

AGAGAATAAA TCATTTATTT ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 600 

ATTATATTTG TGTGAAAATG TGACAAACAC ACTTATCTGT CTCTTCTACA ATTGTGGTTT 660 

ArrGAATGTG TTTTTCTGCA CTAATAGAAA TTAGACTAAG TGTTTTCAAA TAAATCTAAA 720 
TCTTCAAAAA AAAAAAAAAA AAATG6G6CC GCAATT 



Seq ID NO: 6 Protein sequence: 
Protein Accession #: AAB50564

1 11 21 31 41 51

| | | | | |

MMAGMKIQLV CMLLUAFSSW SLCSDSBEEM KALEADFLTN MHTSKISKAH VPSWKKTIiIiN 60 

VCSLVNNLNS PAEETGEVHE EELVARRICLP TALDGFSLEA MLTIYQLHKI CHSRAFQHHE 120 
LIQEDILDTG NDKNGKEEVI KRKIPYILKR QLYENKPRRP YILKRDSYYY 
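
The sequences in Table 16 are printed in blocks of ten characters, six blocks per line, with a running position count at the end of each full line. As an illustration only, the sketch below shows one way such lines could be joined back into a single contiguous string; `join_listing_lines` is a hypothetical helper, and it assumes that a trailing all-digit token is always the position count and never part of the sequence.

```python
from typing import Iterable


def join_listing_lines(lines: Iterable[str]) -> str:
    """Concatenate listing lines, dropping block spaces and trailing counters."""
    chunks = []
    for line in lines:
        tokens = line.split()
        # Assumption: a final all-digit token is the running position count.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        chunks.extend(tokens)
    return "".join(chunks)


# Two lines taken from the Seq ID NO: 5 listing (positions 61-180).
listing = [
    "CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCTGTGCTCA GATTCAGAAG AGGAAATGAA 120",
    "AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 180",
]
sequence = join_listing_lines(listing)
print(len(sequence))
```

Comparing the joined length against the printed position count gives a quick way to spot lines in which characters were lost or split during transcription.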

Seq ID NO: 7 DNA sequence

Nucleic Acid Accession #: NM_006536.2

Coding sequence: 109-2940

1 11 21 31 41 51

| | | | | |

ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCTGCATC CATATTGAAA ACCTGACACA 60 

ATGTATGCAG CAGGCTCAGT GTQAGTGAAC TGGAGGCTTC TCTACAACAT GACCCAAAGG 120 

AGCATTGCAG GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180 

GAACTCCCAT TCCTGGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240 

ATTGCAATTA ATCCTCAGGT ACCTGAGAAT CAGAACCTCA TCTCAAACAT TAAGGAAATG 300 

ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTACCAAGA GAAGAGTATT TTTCAGAAAT 360 

ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420 

TCATATGAAA AGGCAAATGT CATAGTGACT 6ACTGGTATG GQ6CACATGG AGATGATCCA 480 

TACACCCTAC AATACAGAGG GTGTGGAAAA GAGGGAAAAT ACATTCATTT CACACC TAAT 540 

TTCCTACTGA ATGATAACTT AACAGCTGGC TACGGATCAC GAGGCCGAGT GTTTOTCCAT 600 

GAATGGGCCC ACCTCCGTTG GGGTGTGTTC GATGAGTATA ACAATGACAA ACCTT TCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720 

GTGTGTGAAA AAGGTCCTTC CCCCCAAGAA AACTGTATTA TTAGTAAGCT TTTTAAAGAA 780 

GGATGCACCT TTATCTACAA TAGCACCCAA AATGCAACTG CATCAATAAT GTTCATGCAA 840 

AGTTTATCTT CTGTGGTTGA ATTTTGTAAT GCAAGTACCC ACAACCAAGA AGCACCAAAC 900 

CTACAGAACC AGATGTGCAG CCTCAGAAGT GCATGGGATG TAATCACAGA CTCTGCTGAC 960 

TTTCACCACA GCTTTCCCAT GAATGGGACT GAGCTTCCAC CTCCTCCCAC ATTCTOGCTT 1020 

GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCAGAG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAGCC GCAGAATTTT ATTTGATGCA GATTGITCAA 1140 

ATTCATACCT TOGIGGGCAT TGCCAGTTTC GACAGCAAAG GAGAGATCAG AGCCCAGCTA 1200 

CACCAAATTA ACAGCAATGA TQATCGAAAG TTGCTGGTTT CATATCTGCC CAC CACTG TA 1260 

TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGGTTGAA 1320 

AAACTGAATC GAAAAGCTTA TGGCTCTGTG ATGATATTAG TGACCAGCGG AGATGATAAG 1380 

CTTCTTGGCA ATTGCTTACC CACTGTOCTC AGCAGTGGTT CAACAATTCA CTCCATTGCC 1440 
CTGGGTTCAT CTGCAGOOOC AAATCTGGAG GAATTATCAC 
TTCTTTGTTC CAGATATATC AAACTOCaAT AGCATGATTG 
TCTCGAACTG GAGACATTTT CCAGCAAXZAT ATTCAGCTTG 
AAACCTCACC ATCAATTGAA AAACACAGTG ACTGTGGATA 
ATGTTTCTAG TTACGTGGCA GGCCAGTGGT CCTCCTGAGA 
GGACXsAAAAT ACTACACAAA TAATTTTATC ACCAATCTAA 
TGGATTCCAG GAACAGCTAA GCCTGGGCAC TGGACTTACA 
TCTCTGCAAG CCCTGAAAGT GACAGTGACC TCTOGOGOCT 
GCCACTGTGG AAGCCrTTGT GGAAAGAGAC AGOCTOCATT 
TATGCCAATG TGAAACAGGG ATTTTRTCCC ATTCTTAATG 
GAGCCACAGA CTGGAGATCC TGTTA(33CTG A GACTCCT TG 
GTTATAAAAA ATGATGGAAT TTACTCGAGG TATTTTTTCT 
TATACCTTGA AAGTGCATST CAATCACTCT GOCAGCATAA 
CCAGGGAGTC ATGCTATGTA TGTACCAGGT TACACAGCAA 
GCTCCAAGGA AATCAGTAGG CAGAAATGAG GAGGAGOGAA 
AGCTCAGGAG GCTCCTTTTC AGTGCTGGGA GTTCCAGCTG 
CCACCATGCA AAATTATTGA CCTGGAAGCT GTAAAAGTAG 
TGGACAGCAC CTGQAGAAGA CTTTGATCAG GGCCAGGCTA 
AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATG 
AAGCGAAATC CTCAGCAAGC TGGCATCAGG GAGATATTTA 
AOGAATGGAC CTGAACATCA GCCAAATGGA GAAACACATG 
GCAATACGAG CAATGGATAG GAACTOCTTA CAGTCTGCTG 
CCTCTGrTTA TTCCCCCCAA TTCTGATCCT GTACCTGCCA 
GGAQTTTTAA CAGCAATGGG TTTGATAGGA ATCATTTGCC 
CATACTTTAA GCAGGAAAAA GAGACCAGAC AAGAAA6AGA 
ATAAATATCC AAAGTGTCrr CCTTCTTAGA TATAAGACCC 
CATACTAACA AAGTCAAATT AACATCAAAA CTGTAT TAAA 
ATACAGATAA GATTTTTACA TGGTAGATCA ACAATTCTTT 
CCTTACACTT TGGCTATGAA CAAATAATAA AAATTATTCT 
GCAAAGGGAA GGGTAAAGTC GGACCAGTQT CAAGGAAAG7 
AATAGOXCA AGCAGAGAAA AGGAGGGTAG 6TCTGCATTA 
TCATTTAGTT ACTTTGATTA ATTTTTCTTT TCTCCTTATC 
TTTACATGAA GATCATGCTA TATTTTATAT ATGTAGCCCC 
CTTGCTATTT TC3TTATATAT ATTTCAGATG ACATCTCCCT 
TTTCACTGTA AGAGGTAACC TTTAACAATA TGGGTATTAC 
TTTATGACAA AGGTCTATTG AATTTATTTG TUTGTAAGTT 
TTTCTAAGTT TATTGCCTTG GGTTATTATG GAATCATA6T 
TACCTAGGAA A 

Seq ID NO: 8 Protein sequence: 
Protein Accession #: NP_006527.1
MTQRSIAGPI 
IKEMITEASF 
GDDPYTLQYR 
KPFYIKGQNQ 
MFMQSLSSW 
TFSLVQAGDK 
SAQLHQINSK 
GDDKLLGHCL 
SRISSGTGDI 
PDPDGRKyyT 
AVPPATVEAF 
AGADVIK23DG 
XQMIZAPRKSV 
LTLSHTAPGB 
PQISTHGPEH 
LILKGVLTAM 



11 
I 

CMUCPVTLLV 
YLFNATKRRV 
GCGKEGKYIH 
IKVTRCSSDI 
EFCNASTHNQ 
WCLVU3VSS 
OORKLLVSYL 
PTVLSS6STI 
PQQHIQLBST 
NNFITNLTFR 
VERDSLHFPH 
lYSRYPFSFA 



21 

I 

ALSSELPFL6 
FFRNIKILIP 
FTPNFLLNDN 
TGIPVCEKGP 
EAPNLQNQMC 
KMAEADRLLQ 
PTTVSAKTDI 
HSIA^^SAA 
GENVKPHHQL 
TASLWIPGTA 
PVMIYANVKQ 
ANGRYSLKVH 



DFDQGQATSY 
QPNGETHESH 
GLIGIICLII 



EIRHSKSLC^ 
RiyVAIRAMD 
WTHKTLSRK 



31 
I 

AGVQLQDNGY 
ATWXANNNSX 
LTAGYGSRGR 
CPQENCIISK 
SLRSAWDVIT 
LQQAAEFYLM 
SICSGLKKGF 
FNXtEELSRLT 
KNTVTVDNTV 
KPQCWTYTLN 
GFYPILNATV 
VKHSPSISTP 
SVLGVPAGPE 
ZQDDFiniAIL 
RNSLQSAVSN 
XRADKKENGT 



Seq ID NO: 9 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 336-632 



1 
1 

CTCCCXITCAC 
CCTGGGTGGG 
GAGCTGGCAC 
CAGGGTTTGG 
CCAGTGGGGC 
GGGTCTGTCT 
CGCTGGCTGT 
AGCTGAGTAA 
AGAAAGTGGA 
AGCAGGTGGA 
ACTTCTTCCA 
TCTCTTGGGC 
TGTTGATAAT 
CTGGGAGATG 
TCTCCAAGGC 
ATTG6AAATC 
AAATACCA 



11 

I 

CCCGGTCCAG 
CTCAGGGGCT 
TCTCTGGGAG 
TGGGATCAGG 
CCACATATAA 
CTGCCACCTG 
GCTGGTCACT 
GGGGGAAATG 
TGAGGAGGGG 
CTTCCAGGAG 
GGGCTGCGCA 
CCAGGACTGT 
ATTTTAATTG 
AGGGCCTCCT 
CAGAGCTATG 
GAGATAGGTT 



21 
1 

GATGCCCAGT 
GCCCTTGACC 
GGAGGGGGCT 
TTGAGGCAGG 
ATCCTCACCC 
GTCTGCCACA 
ACCTTCCACA 
AAGGAACTTC 
CTGAAGAAGC 
TATGCPGTTT 
GACOGACGCr 
TGATGCCTTT 
CTCAGTGATG 
GGATCCTGCT 
CTTTAG6TCT 
GCTGACTTTT 



31 

! 

CCCCACGACA 
TGGCCTAGAG 
GGGAGGGAAT 
TTTGGTTTCX 
TGGGAGCCTG 
GATCCATGAT 
AGTACTCCTG 
TGCACAAGGA 
TGATGGGCAG 
TCCTGGCACT 
GAAGCAGAAC 
GAGTTTTGTA 
TTCCATAACC 
CCCTTCTGGG 
CAATTTTGGA 
ATTTTGTCAA 



GTCTTACAGG 

ATGCTTTCAG 
AAAGTACAGG 
ATACTGTGGG 
TTATATTATT 
CTTTTOGGAC 
CCCTGAACAA 
CCMCICAGC 
TTCCTCATOC 
CCACTGTCAC 
ATGATGGAGC 
CCTTTGCTGC 
GCACCOCAAC 
AGGGTAATAT 
AGTGGGGCTT 
GCCCCCACCC 
AAGAGGAATT 
CAAGCTATGA 
CTATTTTAGT 
CGTTCTCACC 
AAAGCCACAO 
TATCTAACAT 
GAGATTATCT 
TTATTATAGT 
ATGGAACAAA 
AT6GCCTT0G 
A7GCATTGAG 
TTGGGGGTAG 
TTAAAGTAAT 
TTGTTTTATT 
TAACTGTCTG 
TGTGCAGTAC 
TAATGCAAAG 
GCTAATGCTC 
CTTT6TCTCT 
TCTACTCOCA 
TATAGCCCCH 



41 

1 

KGLLIAINPQ 
IKQESYEKAN 
WWSEHMfLR 
LFKEGCTFIY 
DSADFHHSFP 
QIVBIHTPVG 
EWEKLNGKA 
GGLKPFVPDZ 
GNDTMFLV7W 
NTHHSLQALK 
TATVEPETGD 
AHSIPGSHAM 
PDVFPPCaCII 
VNTSKRNPQQ 
lAQAPLFIPP 



41 
1 

CCTCCCACTT 
CCCTCCCCCA 
GAGTGGGAAT 
TTAAAATGC3C 
GCTGCCnGC 
GTGCAGTTCT 
CCAAGAGGGC 
GCIX5CCCAGC 
CCTGGATGAG 
CATCACTGTC 
TCTT6ACTTC 
TTCAATAAAC 
CGGCTGGCTC 
CTCTGACTCT 
ATTtCAAACA 
ATAAAGATAT 



AGGTTTAAAG 
TAGAATTTOC 
TGAAAATGTC 
CAAOGACACr 
TGATOCTGAT 
AGCTAGTCTT 
TACCCATCAT 
T6TG00C0CA 
TGTGATGATT 

tgcc3«:agtt 

AGGTGCTGAT 

aaatggtaga 
cesumCTATT 

TCAGATGAAT 
TAGCOGAGTC 
TGATGTGTTT 
GACCCTATCT 
AATAAGAAIG 
AAATACATCA 
CCAGATTTCC 
AATT^TGn 
TGCOCAGGOG 
TATATTGAAA 
TGTGACACAT 
ATTATTATAA 
ACTACAAAAA 
TTTTTGTACA 
ATTAGAAAAC 
GTCTTTAAAG 
GAGGtTGGAAA 
TGTGAAGCAA 
AGGTTGCTTG 
CTCTTTACCT 
A6A6ATCTTT 
TCATACCGGT 
TCAAAGCAGC 
TATAATGCCT 



51 
1 

VPQfQNLISN 
VIVTDWYGAH 
NGVEDEYNND 
NSTQNATASI 
MNGTELPPPP 
lASFDSKGEI 
YGSVMILVTS 



1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2S20 
2560 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
330O 
3360 
3420 
3480 
3540 
3600 
3660 



QASGPFSZIL 
VTVTSRASNS 
PVTIiRLLDDG 
YVPGYTAKGN 
DLEAVKVEEE 
AGIREIFTFS 
NSDPVPARDY 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



SI 
I 

CCCACTGTGG 
GCTGGTGGTG 
GGCAAGAGGC 
AAGTTGGGGG 
TCrCCTTCCT 
CTGGAGCAG6 
GACAAGTTCA 
TTTGTGGGGG 
AACAGTGACC 
ATGTGCAATG 
CTGCCAT66A 
TTTTTTTGTC 
AGCTGGAGTG 
GCTGGAAATC 
GCAGGAAAAA 
TAAAAAAGGC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 



Seq ID NO: 10 Protein sequence: 




Protein Accession #: NP_005969.1

1 11 21 31 41 51

| | | | | |

KMCSSXjEQAL AVLVTTFHKY SGQBCaiXFKL SKQEMKELLB KEXiPSFVGEK VD5EGLKKLM 
GSU3ENSDQQ VDFQSYAVFL ALXTVMOIDF FQ6CFDRP 



Seq ID NO: 11 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 336-626 



1 11 

1 I 
CTCCOCTCAC CCOGGTtXAG 
CXTGGGTGGG CTCAGGGGCT 
GAGCTGGCAC TCTCTGGGAG 
CAGGGTTTGG TGGGATCAGG 
CCAGTGGG6C CCACATATAA 
GGGTCTGTCT CTGCCACCTG 
CGCTGGCTGT GCTGGTCACT 
AGCTGAGTAA GGGGGAAATG 
ATTCOVQAGA ACCATGTGCT 
GAGACTTGAG AAACCAGAGC 
GGAGAAAGTG GATGAGGAGG 
GCAGCAGGTG GACTTCGAG6 
TGACTTCTTC CAGGGCTGCC 
GATCTCTTGG GCCCAGGACT 
TCTGTTGATA ATATTTTAAT 
TGCTGGGAGA TGAGGGCCTC 
TCTCTCCAAG GCCAGAGCTA 
AAATT6GAAA TGQAGATAGG 
GCAAATACCA 



21 31 

I I 

GATGCCCAGT CCCCA06ACA 
GCOCTTGACC TGGCCTAGAG 
GGAGGGGGCT GGGAGGGAAT 
TTGAGGCAGG TTTGGTTTCC 
ATCCTCACCC TGGGAGCCTG 
GTCTGCCACA GATCCATGAT 
ACGTTCCACA AGTACTCCTG 
AAGGAACTTC TGCACAAGGA 
GTGAGGGCCT TCCGAGTCCA 
CCAGAAGGGA AAAGTGATTG 
GGCTGAAGAA GCTGATGGGC 
AGTATGCTGT TTTCCTGGCA 
CAGACCGACC CTGAAGCAGA 
GTTGATGCCT TTGAGTTTTG 
TGCTCAGTGA TGTTCCATAA 
CTGGATCCTG CTCCCTTCTG 
TGCTTTAGGT CTCAATTTTG 
TTGCTGACTT TTATTTTGTC 



41 51 

I I 
CCTCCC3VCTT CCCACTGTGG 
CCCrcCCCCA GCTGGTGGTG 
GAGTGGGAAT GGCAAGA6GC 
TTAAAATGCC AAGTTGGGGG 
GCTGCCTTGC TCTCCTTCCT 
GTGCAGTTCT CTCGAGCAGG 
GCAAGAGGGC GACAAGTTCA 
GCTGCCCAGC TTT6TGGGGC 
TCPGTTTAAT CCTGTCATTG 
TCCCAAGATC ACACAGCACT 
AGOCTGGATG AGAACAGTGA 
CTCA.TCACTG TCATGTGCAA 
ACTCTTGACT TCCTGCCATG 
TATTCAATAA ACTTTTTTTG 
CCCGGCTGGC TCAGCTGQA6 
G6CTCTGACT CTCCTGGAAA 
GAATTTCAAA CACCAGCAAA 
AAATAAAGAT ATTAAAAAAG 



41 SI 

I I 
KBIjPSFVGHS repcavsafr 



Seq ID NO: 12 Protein sequence: 
Protein Accession #: Eos sequence 

I 11 21 31 

I I I I 

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMXELUI 
VHLFNPVIGD IiRNQSPEGKS DCPKITQHWR KWMRRG 



Seq ID NO: 13 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 58-354

1 11 21 31 41 51

| | | | | |

GTGAGCTCAC CATGTGGGGG TGAGGCTGAG AGAAAACAAG TACACAGCCA CAGATCCATG 
ATGTGCAGTT CTCTGGAGCA GGCGCTGGCT GTGCTGGTCA CTACCTTCCA CAAGTACTCC 
TGCCAAGAGG GOGACAAGTT CAAGCTGAGT AAGGGGGAAA TGAAGGAACT TCTGCACAAG 
GAGCTGCCCA GCTTTGTGGG GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC 
AGCCTGGAT6 AGAACAGTGA GCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA 
CTCATCACTG TCATGTGCAA TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGA AGCAGA 
ACTCTTGACT TCCTGCCATG GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG 
TATTCAATAA ACTTTTTTTG TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA 
CCCGGCTGGC TCAGCTGGAG TGCTGGGAGA TGAGGGCCTC CT GGAT CCTG CTCCCTTCTG 
GGCTCTGACT CTCCTGGAAA TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTC AATT TTG 
GAATTTCAAA CACCAGCAAA AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC 
AAATAAAGAT ATTAAAAAAG GCAAATACCA 

Seq ID NO: 14 Protein sequence: 
Protein Accession #: NP_005969.1

1 11 21 31 41 51

| | | | | |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKBLLH KELPSFVG2K VDEEGLKKLM 
GSLDENSDQQ VDFQEYAVFL ALITVMCNOF FQGCPORP 



Seq ID NO: 15 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 62-358 



1 11 21 31 41 51

| | | | | |

GGAGGGTGTG CCGCTGAGTC ACTGCCTGGG CATCTGGGCC TGGAACCTCG GCCACAGATC
CATGATGTGC AGTTCTCTGG AGCAGGCGCT GGCTGTGCTG GTCACTACCT TCCACAAGTA
CTCCTGCCAA GAGGGCGACA AGTTCAAGCT GAGTAAGGGG GAAATGAAGG AACTTCTGCA
CAAGGAGCTG CCCAGCTTTG TGGGGGAGAA AGTGGATGAG GAGGGGCTGA AGAAGCTGAT
GGGCAGCCTG GATGAGAACA GTGACCAGCA GGTGGACTTC CAGGAGTATG CTGTTTTCCT
GGCACTCATC ACTGTCATGT GCAATGACTT CTTCCAGGGC TGCCCAGACC GACCCTGAAG
CAGAACTCTT GACTTCCTGC CATGGATCTC TTGGGCCCAG GACTGTTGAT GCCTTTGAGT
TTTGTATTCA ATAAACTTTT TTTGTCTGTT GATAATATTT TAATTGCTCA GTGATGTTCC
ATAACCCGGC TGGCTCAGCT GGAGTGCTGG GAGATGAGGG CCTCCTGGAT CCTGCTCCCT
TCTGGGCTCT GACTCTCCTG GAAATCTCTC CAAGGCCAGA GCTATGCTTT AGGTCTCAAT
TTTGGAATTT CAAACACCAG CAAAAAATTG GAAATCGAGA TAGGTTGCTG ACTTTTATTT
TGTCAAATAA AGATATTAAA AAAGGCAAAT ACCA



Seq ID NO: 16 Protein sequence:
Protein Accession #: NP_005969.1

1 11 21 31 41 51

| | | | | |

KMCSSLBQAL AVLVTTFHKY SOQBGDKFKL SKGEMKELLR KELPSPVGEK VDEEGLKKLM 60 

GSLDEKSDQQ VDFQEYAVFL ALITVKQIDP FQGCPDRP 



Seq ID NO: 17 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 939-2372

1 11 21 31 41 51

| | | | | |

AAGACGGATT CTCAGACAAG GCTTGCAAAT GCCCCGGAGC CATCATTTAA CTGCACCGGC 60 

AGAATAGTTA CGGTTTGTCA CCCGACCCTC CCGGATCGCC TAATTTGTCC CTAGTGAGAC 120 

CrCGAGGCTC TGCCCGCGCC TGCCTTCTTC GTAGCTGGAT GCATATOTPG CTCX3GGGCAG 180 

CX3CGGGCGCA GGGCACGCGT TCX3CGCACAC CCTAGCACAC ATGAACACGC GCAAGAGCTG 240 

AACCAAGCAC GGTTT C CATT TCAAAAAGGG AGACAGCCTC TACCGOSATT GTAGAAGAGA 300 

CTGTCGTGTG AATTAGGGAC GGGGAGGGGT CGAAOGGAGG AAOGGTTCAT CTTAGAGACT 360 

AATTTTCTGG AGTTTCTGCC CCTGCTCTGC GTCAGCCCTC AOtSTCACTTC GCCAGCAGTA 420 

GCAGAGGOGG CGGOGGCGGC TCCCGGAATT GGGTTGGAGC AGGAGCCTCG CTGGCTGCTT 4B0 

CGCTCGOSCT CTAOGCGCTC AGTCCCCGGC GGTACCAGGA GCCTGGACCC AGGOGCCGCC 540 

GGCXaSGCXSTG AGGCGCCGGA GCCCGGCCTC GAGCTGCATA CGGGACCCCC ATTCGCATCT 600 

AACAAGGAAT CTGGSCCCCA GAGAGTCCCG OGAGOGOOGC OGGTCGGTGC CCGGCGCGCC 660 

GGGCC3VTCCA 60GACGGCOG COGOGGAGCT CCGRGCAGOG GTAGCGCCCC CCTGTAAAGC 720 

GGTTCGCTAT GCCGG6GCCA CTGTGAACCC TGCCGCCTGC OGGAACACTC TTCGCTCCGG 780 

ACCAGCTCAG CCTCTGATAA GCTCGACTOG GCACGCCCXX: AACAAGCACC GAGGAGTTAA 840 

GAQAGCCGCA AGCGCAGGGA AGGCCTCCCC GCACG6GTGG GGGAAAGOGG CCGGTGCAGC 900 

GCGGGGACAG GCACTCGGGC TGGCACTGGC TGCTAGGGAT GTCGTCCTGG ATAAGGTGGC 960 

ATGGACCXXX: CATGGCGCGG CTCTGGGGCT TCTGCTGGCT QGTTGTGGGC TTCTGGAGGG 1020 

CCGCTTTGGC CTGTCCCACG TCCTGCAAAT GCAGTGCCTC TOGGATCTGG TGCAGCGACC 1080 

CTTCTCCTGG CATOGTGGCA TTTCCGAGAT TGGAGOCTAA CACTGTAGAT CCTGA GAAC A 1140 

TCACCGAAAT TTTCATCGCA AACCAGAAAA GGTTAGAAAT CATCAACGAA GATGATGTTG 1200 

AAGCTTATGT GGGACTGAGA AATCTGACAA TTGTGGATTC TGGATTAAAA TTTGTGGCTC 1260 

ATAAAGCATT TCTGAAAAAC AGCAACCTGC AGC»JCATCAA TTTTACCCGA AACAAACTGA 1320 

CGAOTriGTC TAGGAAACAT TTCCGTCACC TTGACTTCTC TGAACTGATC CTGGTGGGCA 1380 

ATCCATTTAC ATGCTCCTGT GACATTATGT GGATCAAGAC TCTCCAAGAG GCTAAATCCA 1440 

GTCCAGACAC TCAGGATTTG TACTGCCTGA AT6AAAGCAG CAAGAATATT CCCCTGGCAA ISOO 

ACCTGCAGAT ACCCAATTGT GGTTTGCCAT CTGCAAATCT GGCCXSCACCT AACCTCACTG 1S60 

TGGAGGAAGG AAAGTCTATC ACATTATCCT GTAGTGTGGC AGGTGATCCG GTTOCTAATA 1620 

TGTATTGGGA TGTTGGTAAC CTGGTTTCCA AACATATGAA TGAAACAAGC CACACACAGG 1680 

GCTCCTTAAG GATAACTAAC ATTTCATCCG ATGACAGTGG GAAGCAGATC TCTTGrGTGG 1740 

CGGAAAATCT TGTAGGAGAA GATCAAGATT CTGTCAACCT CACTGTGCAT TTTGCACCAA 1800 

CTATCACATT TCTCGAATCT CCAACCTCAG ACXa^CCACTG GTGCATTCCA TTCACTGTGA 1860 

AAGGCAACCC CAAAGCA60G CTTCAGTGGT TCTATAAOGG GGCAATATTG AATGAGTCCA 1920 

AATACATCTG TACTAAAATA CATGTTACCA ATCACACGGA GTACCACGGC TGCCTCCAGC 1980 

TGGATAATCC CACTCACATG AACAATGCGG ACTACACTCT AATAGCCAAG AATGAGTATG 2040 

GGAAGGATGA GAAACAGATT TCTGCTCACT TCATGGGCTG GCCTGGAATT GACGATGGTG 2100 

CAAACCCAAA TTATCCTGAT GTAATTTATG AAGATTATGG AACTGCAGCG AATGACATCG 2160 

GGGACACCAC GAACAGAACT AATGAAATCC CTTOCACAGA CGTCACTGAT AAAACCGGTC 2220 

GGGAACATCT CIOGGTCTAT GCTGTGGTG6 TGATTGCCTC TGTGGTGGGA TTTTGCCTTT 2280 

TX3GTAATGCT GTTTCTGCrT AAGTTGGCAA GACACTCCAA GTTTGGCATG AAAGGTTTTG 2340 

TTTTGTTTCA TAAGATCCX:a CTGGATGGGT AGCTGAAATA AAGGAAAAGA CAGAGAAAGG 2400 

GGCTGTGGTG CTTGTTGGTT GATGCTGCCA TGTAAGCTGG ACTCCTGGGA CTGCTGTTGG 2460 

CTTATCCOGG GAAGTGCTGC TTATCTGGGG TTTTCTGGTA GATGTGGGCG GTGTTTGGAG 2520 

GCTGTACTAT ATGAA6CCTG CATATACTGT GAGCTGTGAT TGGGGAACAC CAATGCAGAG 2580 

GTAACTCTCA GGCAGCTAAG CAGCACCTCA AGAAAACATG TTAAATTAAT GCTTCTCTTC 2640 

TTACAGTAGT TCAAATACAA AACT6AAATG AAATCCCATT GGATTGTACT TCTCTTCTGA 2700 

AAAGTGTGCT TTTTGACCCT ACTGGACATT TATTGACTTA ATTGCTTCTG TTTATTAAAA 2760 

TTCACCTGCA AAGTTAAAAA AAAATTAAAG TTGAGAACAG GTATAAGTGC ACACTGAATA 2820 

GTCTAATCTA CATGTAACAC ATATTTTAGT GTGATTTTCT ATACTCTAAT CAGCACTGAA 2880 

TTCAGAGGGT TTGACTTTTT CATCTATAAC ACAGTGACTA AAAGAGTTAA GGGTATATAT 2940 

ACCATCACTT TGGGACTTGG TAGTATTATT AAAAGGTTAT TTCCTTCACT GTCAATAAAA 3000 

GTCCAAATGT nAGCTTAGG TCTGAGAGTC AAACAATGTT AAGGATTGTC TTAAAGTTCX 3060 

TTAGCCAGCA AAACAAAACA AAACAAAACA AACAAATGAA AAACGTTTAA AAAGAAGAAG 3120 

AAGAAAAAAA ACAAGAACAA GCAGCAACAG CTGTTTTGTT GGGGCTATAG ATTTAAGTTA 3180 

GGCATAGTCA ATTTCAGAAT AACTAAGAGT GGAATATATG CATATGGTCA AATTATAACC 3240 

TTGCCCTTTT TTATTTtXCC TCTGCGATCC ACCTGCTTTT TAGAAGTCTG CCGAGTGAGA 3300 

AGGCXACAGT ATCTCATGCT GTTTGCATTA CAGAACTGCA GLTITTCTAC TCTGAAAAGG 3360 

OCTGGGAGCA GAATGGCTGG CCTGCTGTGA GCAGGAGAGG AGATTCTAAG AAGGATAGTC 3420 

CCCCCTACAA CATACTGTCA TACTCCTGG6 TTTTCATGGG TAGGAAAGCT TGTCCTGACC 3480 

CCAGCAGCAA A6AG6TGGCA GGTGGCTAAT 6AATATATGC TTTATAATGr CCTTCTTCAT 3540 

TGCTGAGAGG GCAGCCTTAG AGCTGTGGAT TTCTGCATOC CCCCTGAGTC TGACCCATGG 3600 

ACACCTGTTT CATTCACTTT AGCATCACAG TGACCTTTGT ATGCTCTGTT CAGTCTGTGT 3660 

CAGGCAGTAT GCTTGTCCTG AAGAGAGGTT TGGCTATCCC CACCCCACCC CACCCCACCC 3720 

TGTTCCTTTT TTATCAGGAG GACTTCAGAG OCAGGCCXGC AGCATTTTGT TTGAAAACAC 3780 

AATCAGCTCT GACAGTTAGA CATGCACACA GAOGCCATAG CTGGATTGGA AACATTGATG 3840 

TTTTAAAAAT TTATTTTTTT TGGAAATAGT TGC3VCAAATG CTGCAATTTA GCTTTAAGGT 3900 

TCTATAGATT TTTAACTAGT CCAACACAGT CAGAAACATT GTTTTGAATC CTCTGTAAAC 3960 

CAAGGCATTA ATCTTAATAA ACCAG6ATCC ATTTAGGTAC CACTTGATAT AAAAAGGATA 4020 

TCCATAATGA ATATTTTATA CTGCATCCTT TACATTAfiCC ACTAAATACG TTATrGCTltS 4080 

ATGAA6ACCT TTCACAGAAT CCTATGGATT GCAGCATTTC ACTTGGCTAC TTCATAOOCA 4140 



TGOCTTAAAG AGOQGCAGTT TCTCAAAACC 
TOCTAACTCC ATTTGAATGT AAGGGCAGCT 
TCTGAATTCC CATTTTCTTG TTOSaSGCTA 

GATcrrrooc aaaggtgttg atttacaaag 

6AAAGAGAGA TGAAATTCAA GCTGTGAGCC 
6AGAATCAGC CATTTGGTAC AAAAAAGATT 
ATAGAAAGGC TATGGATTGT TTAAGAACTA 
AATAAAAAAA AAGGAATATT TGTACCCAAC 
TTTAAAATGG AGAGAAGTGG ACAGATAAGG 
CrOCTAOGGA ATGATGAAAA CACCAOGCTA 

Seq ID NO: 18 Protein sequence:
Protein Accession #: CAA53571






AGAAACA3GC 
GGCCCCCAAT 
AATGACAGTT 
AGGCGAGCTA 



TTTAAAGCTT 
TTTTAAAGTG 
AGCTAGAAGG 
CCATTTAATA 
T 



OGCCAGTTCT 
GTGGGGAGGT 
TCTGTCATTA 
AXAGCAGAAA 
TCAGTATGGC 
TTATOTTATA 
TTCCAGACCC 
ATTGCAAGGT 
TATCAAAGAT 



CAAGTTrTTCC 
COSAACATTT 
CTTAGATTCC 
TCATGACCCT 
AAAGGTTCTT 
CCATGGAGCC 
AAAAAGGAAA 
AGATTTTTGT 
CAGTTGACAT 



4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 



1 
I 

MSSHIRWBGP 
NSVDPSNITB 
NFTRNKLTSL 
SKNIPLANIiQ 
KETSHTQGSL 
WCIPPTViaW 
LIAKNEYGKD 
DVTDKTGSEH 



11 
I 

AMARLIIGFCW 
IPIANQKRLE 
SRKHFRHIiDL 
IPNCGLPSAN 
RITNISSDDS 
PKPALQWFYM 
EKQISAHPKG 
LSVYAWVIA 



21 
1 

LWGFWRAAF 

SEhXLVGHPF 
LAAPNLTVEE 
GKQISCVAEN 
GAIUIESKyi 
WPGIDOGAHP 
SWGFCLLVM 



31 
1 

ACPTSCKCSA 
VGLHKLTIVD 
TCSCDIMWIK 
GKSITLSCSV 
LVGEDQDSVN 
CTKIHVTNHT 
NYPDVIYEDY 
LFLUCLASHS 



41 
I 

SRIUCSDPSP 
SGLKPVAKKA 
TLQEAKSSPD 
AGDPVPNMYW 
LTVHFAPTIT 
EYHGCLQLON 
GTAANDIGOT 
KFGNRSFVItP 



SI 
1 

GIVAFPRliEP 

FLIOISNLQHI . 

TQDLYGUIES 

DVGNIiVSKHM 

FLESPTSDHH 

PTHMNNGDYT 

TNRSNEIPST 

HKIPLDG 



Seq ID NO: 19 DNA sequence
Nucleic Acid Accession #: NM_000228
Coding sequence: 82-3600

I 11 21 31 41 

I 1 I I 1 

GCTTTCAGGC GATCTGGAGA AAGAACGGCA GAACACACAG CRAGGAAAGG 
GGATCACCCC ATTGGCTGAA GATGAGACCA TTCrrTCCTCT TGTGTTTTGC 
CTCCTGCATG CCCAACAAGC CTGCTCCCGT GGGGCCTGCT ATCCACCTGT 
CTTGTTCGGA GGACCOGGTT TCTCCGAGCT TCATCTACCT GTGGACTGAC 
AGCTACT6CA OCCAGTATGG CGAGTGGCAG ATGAAATGCT GCAAGTGTGA 
CCTCACAACT ACTACAGTCA CCGAGTAGAG AATGTGGCTT CATCCTCCGG 
TGGTGGCAGT CCCAGAATGA TGTGAACCCT GTCTCTCTGC AGCTGGACCT 
TTCCAGCTTC AAGAAGTCAT GATGGAGTTC CAGGGGCCCA TGCOOGCOGG 
GAGCGCTCCT CAGACTTCGG TAAGACCTGG CGAGTGTACC AGTACCTGGC 
ACCTCCACCT TCCCTCGGGT CCGCCAGGGT CGGCCTCAGA GCTGGCAGGA 
CAGTCCCTGC CTCAGAGGCC TAATGCACGC CTAAATGGGG GGAAGGTCCA 
ATGGATTTAG TGTCTGGGAT TCCAGCAACT CAAAGTCAAA AAATTCAAGA 
ATCACAAACT TGAGAQTCAA TTTCACCAG6 CTGGCCOCTG TGCOCCAAAG 
CCTCCCAGCG CCTACTATGC TGTGTCCCAG CTCGGTCTGC AGGGGAGCTG 
GGCCATGCTG ATCGCTGCGC ACCCAAGCCT GGGGCCTCTG CAGGCCCXTTC 
CAGGTCCACG ATGTCTGTGT CTGCCAGCAC AACACTGCCG GCCCAA&TTG 
GCACCCTTCT ACAACAACCG GCCCTGGAGA CCGGCGGAGG GCCAGGACGC 
CAAAGGTGCG ACTGCAATGG GCACTCAGAG ACATGTCACT TTGACCCCGC 
QCCAGCCAGG GGGCATATGG AGGTGTGTGT GAC3VATTGCC GGGACCACAC 
AACTGTGAGC GGTGTCAGCT GCACTATTTC CG6AACCX3GC GCCOjGGAGC 
GAGACCTGCA TCTCCTGCGA GTGTGATCCG GATGGGGCAG TGCCAGGGGC 
CCAGTGACCG GGCAGTGTGT GTGCAAGGAG CATGTGCAGG GAGAGOGCTG 
AAGCGGGGCr TCACTGGACT CACCTACGCC AACCCGCAGG GCTGCCACGG 
AACATCCTGG GGTCCCGGAG GGACATGCCG TGTGACGAGG AGAGTGGGCG 
CTGCCCAACG TCGTGGGTCC CAAATGTGAC CAGTGTGCTC C CTAC CACTG 
AGTGGCCAGG GCTGT6AACC GTGIGOCTGC GACC06CACA ACTCCOCTCA 
CAACCAGTTC ACAGGGC3VGT GCCCTGTCGG GAAGGCTTTG GTGGCCTGAT 
GCAGCCATCC GCCAGTGTCC AGACCGGACC TAT6GAGACG TGGCXyVCAGG 
TGTGACTGTG ATTTCCGGGG AACAGAGGGC CCGGGCTGOS ACAAGGCATC 
CTCTGCCX3CC CTGGCTTGAC CGGGCCCCGC TGTGACCAGT GCCAGCGAGG 
OGCTACCOGG TGTGCGTGGC CTGCCACCCT TGCTTCCAGA CCTATGATGC 
GAGCAG6CCC TGCGCTTTGG TAGACTCCGC AATGCCACCG CCAGCCTGTG 
GGGCTGGAGG ACCGTGGCCT GGCCTCCCGG ATCCTAGATG CAAAGAGTAA 
ATCCGAGCAG TTCTCAGCAG CCCCGCAGTC ACAGAGCAGG AGGZGGCTCA 
GCCATCCTCT CCCTCAGGCG AACTCTCCAG GGCCTGCAGC TGGATCTGOC 
GAGACGTTCT CCCTTCCGAG AGACCTGGAG AGTCTTGACA GAAGCTTCAA 
ACTATGTATC AGAGGAAGAG GGAGCAGTTT GAAAAAATAA GCAGTGCTGA 
GCCTTCCGGA TGCTCAGCAC AQCCTACGAG CAGTCAGCCC AGGCTGCTCA 
GACAGCTCGC GCCTTTTGGA CCAGCTCAGG GACAGGCGGA GAGAGGCAGA 
CGGCAGGCGG GAGGAGGAGG AGGCACOGGC AGCCDCAAGC TTGTGGCCCT 
ATGTCTTCGT TGCCTGACCT GACACCCACC TTCAACAAGC TCTGTGGCAA 
ATGGCTTGCA CCCCAATATC ATGCCCTGGT GAGCTATGTC CCCAAGACAA 
TGTGGCTCCC GCTGCAGGGG TGTCCTTCCC AGGGCOGGTC GGGCCTTCTT 
CAGGTGGCTG AGCAGCTGG6 G6GCTTCAAT GOOCAGCTCC AGCXSGACX:AG 
AGGGCAGCCG AGGAATCTGC CTCACAGATT CAATCCAGTG CCCAGCGCTT 
GTGAGCGCCA GCCGCTCCCA GATGGAGGAA GATGTCAGAC GCACAOGGCT 
CAGGTCCXXM ACTTCCTAAC AGACCCCGAC ACTGATCCAG CCACTATCCA 
- GAGGCCGTGC TGGCCCTGTG GCTGCCCACA GACTCAGCTA CTGTTCTGCA 
GAGATCCAGG CCATTGCAGC CAGGCTCCCC AACGTGGACT TGGTGCTGTC 
CAGGACATTG CGCGTGCCCG CCX3GTTGCAG GCTGAGGCTG AGGAAGCCAG 
CATGCAGTGG AGGGCCA6GT GGAAGATGTG GTTGGGAAOC TG06GCAGGG 
CTGCAGGAAG CTCAGGACAC CATGCAAGGC AGCAG0C6CT CCCTTOGGCT 
AGGGTTGCTG AGGTTCAGCA GGTACTGCGG CCAGCAGAAA AGCTGGTGAC 
AAGCAGCTGG GTGACTTCTG GACACGGATG GAGGAGCTCC GCCACCAAGC 
GGGGCAGAGG CAGTCCAGGC CCAGCAGCTT GGGGAAGGTG CCAGOGAGCA 
GCCCAAGAGG GATTTGAGAG AATAAAACAA AAGTATGCTG AGTTGAAGGA 



60 
120 
180 
240 
300 
360 
420 



51 

I 

TCCTTTCTGG 
CCTGCCTGGC 
TGGGCACCTG 
CAAGCXrrGAG 
CrCCAGGCAG 
CCCCATGCGC 
GGACAGGAGA 
CAT6CIGATT 
TGOCGACTGC 
TGTTQGGTGC 
ACTTAACCTT 
GGTOGGGGAG 
GGGCTAOCAC 
CTTCTGTCAC 
CACCGCTGTG 
TGAGC3GCTGT 
CCATGAATGC 
TGTGTTTGCC 
OGAAQ6CAAG 
TTCCATTCAG 
TCCCTGTGAC 
TGACCTATGC 
CTGTGACTGC 
CTGCCTTTGT 
GAAGCTGGCC 
GCCCACAGTG 
GTGCAGOGCr 
ATGCOGAGCC 
AGGC03CTGC 
CTACTGCAAT 
GGACCTCCGG 
GTCAGGGCCT 
GATTGAGCAG 
GGTGGCCAGT 
CCTGGAGGAG 
TOSTCTCCTT 
TCCTTCAGGA 
GCAGGTCTCC 
GAGGCTG6TG 
GAGGCTGGAG 
CTCCAGGCAG 
TGGCACAGCC 
GATGGCGGGG 
GCAGATGATT 
GGAGACCCAG 
CCTAATCCAG 
OGAGGTCAGC 
GAAGATGAAT 
CCAGACCAAG 
GAGCCGAGCC 
GACAGTGGCA 
TATCCAGGAC 
AAGCATGACC 
COGGCAGCAG 
GGCATTGAGT 
COGGTTGGGT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 




CAGAGrrCCA TGCTGGGTGA GCAGGGTGCC 
GAGCTGTTTG GGGAGACCAT GGAGATQATG 
CIGOGGGGCA GCXaGGCCAT CATGCTGOSC 
CTGGAGCAGft TCCETGRCCA CATCRATGGG 
TGCTACAGCT TCCAGCCCGT TGOCCCRCTC 
GKrTGQGTTG GAATGCTTrC CATCICCAGG 
CRCCMXXXT QGTGTGTACC TAGTAAOATT 
GGGACRGTTA CACTTGACAG ACAAAGATGG 
CTCTCAAGTC AAGGAAGCTG GGCTGGGCSC 
GSAATOCIGG ACCAAGCACA AAAACTTAAC 
AAAATCTTT6 G 

Seq ID NO: 20 Protein sequence:
Protein Accession #: NP_000219






GACAGGA7GA 
TCGGCGGACC 
C3GCGTGCTCT 
ATCTGCCGCX: 
AGACTTTCAT 
ACCCTGAGCT 
TGGAGATTGG 
TATOOOOOGC 
AAAAGTGAT6 



GTGTGAACAC 
AAGACATGGA 
TGACAGGACT 
ACTATGCCAC 
TTTGCTTTTG 
GCAGCCTAAA 
GCAGCTGAGC 
CATGCCATTG 
CTTTAGTTCT 
TAAAAATGAA 



AGAGGCAGAG 
GTTGGW3CTG 
GGAGAAGGST 
CTGCAAGTGA 
GTTGGGGGCA 
GTACAGCCTG 
CTGAGCCAAT 
AAACTAAGAG 
CCACTGGGGA 
AAGOCAAATA 



3420 
3480 
3540 
3600 
3660
3720 
3780 
3840 
3900 
3960 



1 
I 

MRPFFX«LCFA 



MEFQGPMPAG 
NARIiNGGKVQ 
VSQLRLQGSC 
PWRFAEGQDA 
HyFRNSRPGA 
TYANPQGCHR 
CACDPHNSPQ 
TEGPGCDKAS 
RLRNATASLH 
TLQGLQIiDLP 
AYEQSAQAAQ 
TPTFNKLOGH 
GFNAQLQRTR 
DPDTDAATIQ 
RLQAEAEEAR 
VLRPAEKLVT 
IKQKYAELKD 
KLRSADLTGL 



11 
I 

LPGIiLHAQQA 
SRQPHNYYSH 
MLIERSSDFG 
LNLMDLVSGI 
FCHGHADRCA 
EEOQRCDCNG 
SIQETCISCB 
CDCNILGSRR 
PTVQPVHRAV 
GRCLCRPGLT 
SGPGLSDRGI* 
IiEEETIiSLPR 
QVSDSSRLU3 
SRQKACTPIS 
QMIRAAEBSA 
EVSEAVLALW 
SRAHAVEGQV 
SMTKQLGDPW 
RLGQSSMLGB 
EKRVEQIRDB 



21 
I 

CSRGACyPPV 
RVENVASSSG 
KTWRVYQyLA 
PATQSQKIQE 
PKPGASAGPS 
HSETCHPDPA 
CDPDGAVPGA 



PCREGFGGLM 
GPRCDQCORG 
ASRILDAKSK 
DLESLDRSFN 
QLRDSRREAB 
CPGELCPODN 
SQIQSSAQRL 
LPTDSATVLQ 
EDWGNLRQG 
TRMEEIiRHQA 
QQARIQSVKT 
IKGRVIiYYAT 



31 
1 

GDItLVGRTRF 
PMRHHQSQND 
ADCTSTFPRV 
VGEITNLRVN 
TAVQVHDVCV 
VFAASQGAYG 
PCDPVTGQCV 
CLCItPIIWGP 
CSAAAIRQCP 
YCNRYPVCVA 
lEQIRAVLSS 
GLLTMYQRICR 
RLVRQAGGGG 
GTAC6SRCRG 
ETQVSASRSQ 
KMNBIQAIAA 
TVALQEAQDT 
RQQGAEAVQA 
EAEBZiFGETM 
CK 



41 
I 

LRASSTCGLT 
VNPVSLQLDL 
RQGRPQSWQD 
FTRLAFVPQR 
CQHNTAGPNC 
GVCDNCRDHT 
CKEHVQGERC 
KCDQCAPYHW 
QRTYGDVATG 
CHPCPQTYDA 
PAVTEQEVAQ 
EQFEKISSAD 
GTGSPKLVAL 
VLPRAGGAFL 
MEEDVRRTRL 
RLPNVDLVLS 
MQGTSRSLRL 
QQLAEGASEQ 
EMMDRMKDMB 



51 
I 

KPETYCTQYG 
DRRFQLQEVM 
VROQSLPQRP 
GYHPPSAYYA 
ERCAPFYMNR 
EGKNCEROQL 
DLCKPGFTGIi 
KLASGQGCEP 
CRACDCDFRG 
DLREQALRFG 
VASAIIiSIiRR 
PSGAFRMLST 
RLEMSSLPDL 
MAGQVAEQLR 
LIQQVRDFLT 
QTKQDIARAR 
IQDRVAEVQQ 
ALSAQEGFER 
LELLRGSQAI 



Seq ID NO: 21 DNA sequence 

Nucleic Acid Accession #: NM_003722

Coding sequence: 145-1491



I 
I 

TCGTTGATAT 
ACAGTACTGC 
AAAGAAAGTT 
CCAGACGTTT 
ATTGACTTGA 
AGCATGGACT 
AOGAACCTGG 
AGTCCCTATA 
CCCAGCTCCA 
CCAGGCCCGC 
TGGACGIATT 
CAGATCAAGG 
AAAAAAGCTG 
GAATTCAACX3 

catgccx:agt 
ccaccccagg 
tgtgttggag 

GGGCAAGTCC 
AGGAAGGCGG 
GATGGTACGA 
AAAGGAAGAT 
GAAATGCTGT 
ATTGAAACGT 
CTTTCAGCCT 
GAOGTCTTCT 
TCTATATTTT 
TGTGTGTGGG 
CCCAACTGCT 
TTACAAGAAA 
GAACCACTGT 
GAAAGGGGCA 
AATTCACAGG 
AAAAAAGTTG 
CCCTTTTAAT 
TACTGCTGGG 
TTTGTGAGAA 
GCTGTGTACC 
CATGAAACCC 
CTCATTTTGT 
TGTTTACCAT 
AATTTGCTTA 
CTGATACTGT 
AGACGTGTTA 



11 
I 

CAAAGACAGT 
CCTGACCCTT 
ATTACCGATC 
TCCAGCATAT 
ACTTTGTGGA 
GTATCCGCAT 
GGCTCCT6AA 
ACAGAGACCA 
CCTTCGATGC 
ACAGTTTCGA 
CCACTGAACT 
TGATGACCCC 
AGCACGTCAC 
AGGGACA6AT 
ATGTAGAAGA 
TTGGCACTGA 
GGATGAACCG 
TGGGCCGACG 
ATGAAGATAG 
AGCGCCCGTT 
CCCCAGATGA 
TGAAGATCAA 
ACAGGCAACA 
GCTTCAGGAA 
TTAGACATTC 
AAGTGTGTGT 
TGTGTATCTA 
CAAAGGCACA 
GGATGTTTTC 
GTTTGTCTGT 
TTAAGATGTT 
GAAGCTTTTG 
TTATTGTCTG 
GCTGGTCATG 
CAGCGAGGTG 
CTTGCATTAT 
TGCCTCTGCX; 
TGGAAGACCT 
GCTTTTAATA 
TATTCAAAGC 
ATTAGAGCTT 
TCAGTGCATT 
AAATCAGCAC 



21 
I 

TQAAGGAAAT 

ACATCCAGCX; 
CACCATGTCC 
CTGGGATTTT 
TGAACCATCA 
GCAGGACTCG 
CAGCATGGAC 
OGOGCAGAAC 
TCTCTCTCCA 
CGTGTCCTTC 
GAAGATVACTC 
ACCTCCTC»G 
GGAGGTGGTG 
TGCCCCTCCT 
TCCCATCACA 
ATTCACGACA 
CCGTCCAATT 
CTGCTTTGAG 
CATCAGAAAG 
TCGTCAGAAC 
TGAACTGTTA 
AGAGTCCCTG 
GCAACAGCAG 
TGAGCTTGTG 
CAAGCCCCCA 
GTTGTATTTC 
GCCCTCATAA 
AAGCCACTAG 
TGCAGATTTT 
GAGCTTTCTG 

tattggaacc 
agcaggtctc 
t6cataagta 
taataatatt 
atcattacca 
ttgtgtcctc 
actgtatgtt 
actacaaaaa 
gaaa6acaaa 

TCAAAATAGA 

CTATCXXTCA 
TAGCCAGGAG 
TCCTGGACTG 



31 



41 



51



GAATTTTGAA 
TTTOGTAGAA 
CAGAGCACAC 
CTGGAACAGC 
GAAGATGGTG 
GACCTGAGTG 
CAGCAGATTC 
AGCGTCAOSG 
TCACCCGCCA 
CAGCAGTCGA 
TACTGCCRAA 
GGAGCTGTTA 
AAGCGGTGCC 
AGTCATTTGA 
GGAAGACAGA 
GTCTTGTACA 
TTAATCATTG 
GCCOGGATCT 
CAGCAAGTTT 
ACACATGGTA 
TACTTACCAG 
6AACTCATGC 
CAGCACCAGC 
GAGCX:CCGGA 
AACGGATCAG 
CATGTGTATA 
AGAGGACTTG 
TGAGAGAATC 
GTATCCTTAG 
TTGTTTCCTQ 
CTTTTCTGTC 
AAACTTAAGA 
AGrrGTAGGT 
6CAA6TAGTA 
AAAGTAATCA 
CCCTCATGTG 
GGCATCTGTT 
AACTGTTGTT 
TCCACCCCAG 
ATTTQAAGCC 
AGCCTA CCTA 
ACTTAOGTTT 
GAAATTAAAG 



ACTTCACXX5T 
ACCCAGCTCA 
AGACAAATGA 
CTATATGTTC 
CGACAAACAA 
ACCCCATGTG 
AGAACGGCTC 
06CCCT0SCC 
TCCCCTCCAA 
GCACCGCCAA 
TTGCAAAGAC 
TCCX3CGCCAT 
CCAACCATGA 
TTCGAGTAGA 
GTGTGCTGGT 
ATTTCATGTG 
TTACTCTGGA 
GTGGTTGGCC 
OGGACAGTAC 
TCCAGATGAC 
TGAGGGGCOG 
AGTACCTTCC 
ACTTACTTCA 
GAGAAACTCC 
TGTACCCATA 
TGTGAGTGTG 
AAGACACTTT 
TTTTGAAGGG 
ACCGGCCATT 
GGAGG6AGGG 
TTCTTCTGTT 
TGTCTTTTTA 
GACTGA6AGA 
AGAAAOGAAG 
ACTTTGTGGG 
TAGGTAGAAC 
ATGCTAAAGT 
TGGCCCCCAT 
TAATATTGCC 
CTCTCACAAA 
CCATAAAACC 
TGAGtlUUiTG 
ATTGAAAOGG 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



GTGCCACCCT 
TTTCTCTTGG 
ATTCCTCAGT 
AGTTCAGCCC 
GATTGAGATT 
GCCACAGTAC 
CTOGTCCACC 
CTAOGCACAG 
CACC3GACTAC 
GTCGGCCACC 
ATGCCCCATC 
GCCTGTCTAC 
GCTGAGCCGT 
GGGGAACAGC 
ACCTTATGAG 
TAACAGCAGT 
AACCAGAGAT 
AGGAAGAGAC 
AAAGAACGGT 
ATCCATCAAG 
TGAGACTTAT 
TCAGCACACA 
GAAACATCTC 
AAAACAATCT 
GAGCCCTATC 
TGTGTGTGTA 
GGCTCAGAGA 
ACTCAAACCT 
GGTGGGTGAG 
GTCAGGTGGG 
GTTTTTCTAA 
AGAAAAGGAG 
CTCAGTCAGA 
6TGTCAAGTG 
TGGAGAGTTC 
ATTTCTTAAT 
TTTTCTTGTA 
AGCAGGTGAA 
CTTACGTAGT 
ATCTGTGATT 
AGCCATATTA 
AGATGGAAfiC 
TAGACTACTT 



60 
120 
180 
240 
300 
360 
420 
480 
540
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520
2580 




TTCTTTTTTT TACTCAAAAG TTTAGAGAAT CTCTGTTTCT TTOCATTTTA AAAACATATT 2640 
TTAAGATAAT AGCATAAAGA CTTOAAAAT G nCCTCCCC TCCATCTTCC CACACCC AGT 2700 
CACXacCACT GTATTTTCTG TCACCAAGAC A ATGATTT CT TGrtMTG ftG GCTGTTGCTT 2760 
TIGTGGATGT GTGATTTTAA TTTTCAATAA ACTTTTGCRT CTTQCTTTAA AAGAAA 

Seq ID NO: 22 Protein sequence: 
Protein Accession #: NP_003713

1 11 21 31 41 51

isQSTQT3JEP Lpevftjhiw dfleqpicsv QPIDLNFVDE PSEDGATOKI EISOCIRKQ 60 

DSDLSDPMWP QYTNWaLNS MDC»IQNGSS STSPYNTDEA QNSVTAPSPY AQPSSTFDAL 120 

SPSPAIPSNT DYPGPESPDV SFQQSSTAKS ATWTYSTBLK KLYOQIAKTC PIQIKVMTPP 180 

PQGAVIRAMP VYKKAEHVTB WKRCPNHEL SREFNEGQIA PPSHLIRVEG NSHAQWEDP 240 

ITGRQSVLVP YEPPQVGTEF TTVLYNFMCN SSCVGGMNRR PILIIVTLBT RDGQVLGRRC 300 

FEARICACPG RDRKADEDSI RKQQVSDSTK KGDGTKRPFR QNTHGIQMTS IKKRRSPDDE 360 

LLYLPVRGRE TYEMLLKIKE SLELMQYLPQ HTIBTYRQQQ QQQHQHLLOK HLLSACPRHE 420 
LVEPRRETPK QSOVFFRHSK PPSRSVYP 



Seq ID NO: 23 DNA sequence

Nucleic Acid Accession #: NM_001944.1

Coding sequence: 84-3083

1 11 21 31 41 51 

TTTTCTTAGA CATTAACTCC AGAOGGCTGG CAGGATAGAA GCAGCGGCTC ACTTGGACTT 60

TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CACAACTACA GGGGCTCTGG 120 

CCATCTTCGT GGTGGTCATA TTCGTTCATG GAGAATTGCX5 AATAGAOACT AAAGGTCAAT 180 

ATGATGAAGA AGAGATGACT ATGCAACAAG CTAAAAGAAG GCAAAAAOGT GAATGGGTGA 240 

AATTTGCCAA ACCCTGCAGA GAAGGAGAAG ATAACTCAAA AAGAAACCCA ATTGCCAAGA 300 

TTACTTCAGA TTACCAAGCA ACCCAGAAAA TCACCTACOG AATCTCTGGA GTGGGAATCG 360 

ATCAIGCCGCC TTTTGCAATC TTTGTTGTTG ACAAAAACAC TGGAGATATT AACATAACAG 420 

CTATAGTOGA COSGGAGGAA ACTCCAAGCT TCCTGATCAC ATGrOQGQCT CEAAATGCOC 480 

AAGGACTAGA TGTAGAGAAA CCACTTATAC TAACGGTTAA AATTTTGGAT ATTAATCSATA 540 

ATCCTCXAGT ATTTTCACAA CAAATTTTCA TGGGTGAAAT TGAAGAAAAT AGTGCCTCAA 600 

ACrCACTGGT GATGATACTA AATGCCACAG ATGCAGATGA ACCAAACCAC TTGAATTCTA 660 

AAATTGCCTT CAAAATTGTC TCTCAGGAAC CAGCAGGCAC ACCCATGTTC CTCCTAAGCA 720 

GAAACSWrrGG GGAACTCCGr ACTTTGACCA ATTCTCTTGA CCGAGAGCAA GCTAGCAGCT 780 

ATCGTCTGGT TGTGAGTGGT GCAGACAAAG ATGGAGAAGG ACTATCAACT CAATGTGAAT 840 

GTAATATTAA AGTGAAAGAT GTCAAOGATA ACTTOCCAAT GTTTAGAGAC TCTCAGTATT 900 

CAGCACGTAT TGAAGAAAAT ATTTTAAGTT CTGAATTACT TCGATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT CTTTACCTCT GGGAATGAAG 1020 

GAAATTGGTT TGAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTGGTGA 1080 

AGGCTCTAGA TTATGAACAA CTACAAAGCG TGAAACTTAG TATTGCTGTC AAAAACAAAG 1140 

CTGAATTTCA CCAATCAGTT ATCTCTCGAT ACCGAGTTCA GTCAACXTCCA GTCACAATTC 1200 

AGGTAATAAA TGTAAGAGAA GGAATTGCAT TCCGTCCTGC TTCCAAGACA TTTACTGTGC 1260 

AAAAAGGCAT AAGTAGCAAA AAATTGGTGG ATTATATCCT GGGAACATAT CAAGCCATCG 1320 

ATGAGGACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATGGGACGT AACGATGGTG 1380 

GATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTCAAAAAT ATGAACCGAG 1440 

ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACft 1500 

CGGGTAAAAC TTCTACAGGC AOQGTATATG TTAGAGTACC CGATTTCAAT GACAATTGTC 1560 

CAACAGCTGT CXTTOGAAAAA GATGCAGTTT GCAGTrCTTC ACCrTOCGTG CTTGTCTCCG 1620 

CTAGAACACT GAATAATAGA TACACTGGCC CCTATACATT TGCACTGGAA GATCAACCTG 1680 

TAAAGTTGCC TCCCGTATGG AGTATCACAA CCCTCAATGC TACCT0G6CC CTCCTCRCAG 1740 

CCCAGGAACA GATACCTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCA6A 1800 

ACAATOGGTG TGAGATGCCA CJGCAGCTTGA CACTGGAAGT CTGTCAGTGT GACAACAGGG 1860 

GCATCTGTGG AACTTCTTAC CCAACCACAA GCCCTGGGAC CAGGTATGGC AGGCCGCACT 1920 

CAGGGAGGCT GGGGCCTGCC GCCATOGGCC TGCTGCTCCT TGGTCTCCTG CTGCTGCTGT 1980 

TGGCCCCOCT TCTCCTGTTG ACCTGTGACT GTGGGGCAGG TTCTACTGGG GGAGTGACAG 2040 

GTGGTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT TCATCAGTGG GGAATTGAAG 2100 

GAGCCCATCC TCAAGACAAG GAAATCACAA ATATTTGTCT GCCTCCTGTA ACAGCCAATG 2160 

GAGCOGATTT CATGGAAAGT TCTGAAGTTT GTACAAATAC GTATGCCAGA GGCACAGCGG 2220 

TGGAAGGCAC TTCAGGAATG GAAATGACCA CTAAGCTTGG AGCAGCCACT GAATCTGGAG 2280 

GTQCTGCftGG CTTTCCAACA GQGACAGTGT CAGGAGCTGC TTCAGGATTC GGAGCAGCCA 2340 

CTGGAGTTCG CATCTGTTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400 

GAGGAAOCAA TAAGGACTAC GCTGATGGGG CGATAAGCAT GAATrTTCTG GACTCCTACT 2460 

TTTCrCAGAA AGCATTTGCC TGTGCGGAGG AAGAOGATGG CCAGGAAGCA AATCACTGCT 2520 

TGTTGATCTA TGATAATGAA GGCGCAGATG CCACTGGrTC TCCTGTGGGC TCCXSTGGGTT 2580 

GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT GGACTCACTT GGACCCAAAT 2640 

TTRAAAAACT TGCAGAGATA AGCCTTGGTG TTGATGGTGA AGGCAAAGAA GTTCAGCCAC 2700 

CCTCTAAAGA CAGCGGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAGAA GTCCAGCAGA 2760 

CAGGATTTGT TAAGTGCCA6 ACTTT6TCAG GAAGTCAAGG AGCTTCTGCT TTGTOOGOCT 2820 

CTGGGTCTGT CCAGCCAGCT GTTTCCATCC CTGACCCTCT GCAGCATGGT AA CTATT TAG 2880 

TAACGGAGAC TTACTCGGCT TCTGGTTCCC TCGTGCAACC TTCCACTGCA GGCTTTGATC 2940 

CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGGTGAT CTGTCCCATT TCCAGTGTTC 3000 

CTCGCAACCT AGCTGGCCCA ACGCAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGG 3060 

ATCCTTGCTC CQGTCTAATA TGACCAGAAT GAGCTGGAAT ACCACACTGA CCAAATCTGG 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTGG GCTAATAATT 3180 

TGGCACTTAT TAGCTTCTCT CATAAACTGA TCACXS^TTAT AAATTAAATG TTTGGGTTCA 3240 

TACCCCAAAA GCAATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTGTAGTAAA 3300 
TCTTAAAGTT TTTCAAAACC CTAAAATCAT ATTGGC 

Seq ID NO: 24 Protein sequence: 
Protein Accession #: NP_001935.1

1 11 21 31 41 51



I 11 I I I 

JiMGLPPRTTG AIAIFVWIL VHGELRIETK GQYDEBEMTM QQAKRRQKRB WVKPAKPOE 60 

GEDNSKRNPI AKITSDYQAT QKITYRISGV GIDQPPFGIP WDKNTGDIN ITAIVDRBBT 120 

PSFLITCRAIj NAQGLDVEKP LILTVKILDI NDNPPVPSQQ IPMGBIEBNS AaiSLVMILN 180 

ATCADBPUEL NSKIAFKIVS QBPAGTPMPL LSHNTGEVRT LTNSU)REQA SSVRLWSGA 240 

DSOX^LSTQ CECNIKVKDV NDNFPMFRDS QYSARIEEWI LSSEU.RPQV TDU>3EYTDN 300 

HliAVYFFTSG NBGNHPEIQT DPRTNBGILK WKALDYEQL QSVKLSIAVK NKAEFHQSVI 360 

SRYRVQSTPV TIQVINVaBG lAFRPASKTP TVQKGISSKK LVDYILGIYQ AXOEDTinCAA 420 

SKVKYVMCSK DGGYUirDSK TAEIKFVKNK USDSTPIVIIK TITABVLAID EVTCKTSTGT 480 

VYVRVPDFND NCPTAVLEKD AVCSSSPSW VSARTLKNRY TGPYTFALED QPVKIiPAVWS 540

rTTLKATSAT) LRAQEQIPPG VYHISLVLTD SQNNRCEMPR SLTLEVOQCD NRGIOGTSYP 600 

TTSPGTRYGR PHSGRU5PAA IGLIiLIiGZitiL LTtTAPLLLLT CEJOGAGSTGG VTGGFIPVPD 660 

GSKGTIKWG IB6AHPSDKE ITNICVPPVT AN6ADEMBSS EVCTNTYARG TAVBGT SGHS 720 

MTTKLGAATE SGGAAGFATG TVSGAASGFG AATSVGICSS GQSGTHRTRH STGGTOKDYA 780 

DGAISMNFLD SYFSQKAFAC AEEDDGQEAK DCLLIYDNEG ADATGSPVGS VGCCSFIADD 840 

LDDSFLDSLG PKFKKLAEIS LGVOGEGKEV QPPSKDSGYG lESCXaiPIEV QQTGFVKCQT 900 

LSGSQGASAL SASGSVQPAV SIPOPLQHGH YLVTETYSAS GSLVQPSTAG FDPLLTQUVI 960 
VTERVICPIS SVPCaOiAGPT QUIGSHTMLC TEDPCSRLX 

Seq ID NO: 25 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 56-1642

1 11 21 31 41 51

I I 1 I I I 

AGTATCCCAG GAGGAGCAAG TGGCAOGTCT TCGGACCTAG GCTGCXXXnX5 CCGTCATGTC 60 

GCAAGGGATC CTTTCTCCGC CAGCGGGCTT CCTGTCCGAT GACGATGTCG TAGTTTCTCC 120 

CATCTTTQAG TCCACAGCTG CAGATTTGGG GTCTGTGGTA CGCAAGAACC TGCTATCAGA 180 

CTGCTCTGTC GTCTCTACCT CCCTAGAGGA CAAGCAGCAG GTTCCATCrG AGGACAGTAT 240 

GGAGAAGGTG AAAGTATACT TGAGGGTTAG GCCCTTGTTA CCTTCAGAGT TGGAACX5ACA 300 

GGAAGATCAG GGTTGTGTCC GTATTGAGAA TGTGGAGACC CTTGTTCTAC AAGCACCCAA 360 

GGACTCTTTT GCOCTGAAGA GCRATGAACX5 GGGAATTGGC CAAGCCACAC ACAGGTTCAC 420 

CTTTTCCCAG ATCTTTGGGC CAGAAGTGGG ACAGGCATCC TTCTTCAACC TAACTGTGAA 480 

GGAGATGGTA AAGGATGTAC TCAAAGGGCA GAACTGGCTC ATCTATACAT ATGGAGTCAC 540 

TAACTCAGGG AAAACCCACA CGATTCAAGG TACCATCAAG 6ATGGAGGGA TTCTCCCCCG 600 

GTCCCTGGCG CTGATCTTCA ATAGCCTCCA AGGCCAACTT CATCCAACAC CTGATCTGAA 660 

GCCCTTGCTC TCCAATGAGG TAATCTGGCT AGACAGCAAG CAGATCOGAC AGGAGGAAAT 720 

GAAGAAGCTG TCCCTGCTAA ATGGAGGCCT CCAAGAGGAG GAGCTGTCCA CTTCCTTGAA 780 

GAGGRGTGTC TACATCGAAA GTCGGATAQG TACCAGCACC AGCTTOGACA GTGGCATTGC 840 

TGGGCTCTCT TCTATCAGTC AGT6TACX»G CAGTAGCCAG CTGGATGAAA CAAGTCATC3G 900 

ATGGGCACAG CCAGACACTG CCCCACTACC TGTCCCGGCA AACATTOGCT TCTCCATCTG 960 

GATCTCATTC TTTGAGATCT ACAACGAACT GCTTTATGAC CTATTAGAAC OGCCTAGCCA 1020 

ACAGCGCAAG AGGCAGACTT TGOGGCTATG CGAGGATCAA AATGGCAATC CCTATGTGAA 1080 

AGATCTCAAC TG6ATTCATG TGCAAGATGC TGAGGAG6CC TGGAAGCTCC TAAAAGTGGG 1140 

TCGTAAGAAC CAGAGCTTTG OCAGCACCCA CCTCAACX^G AACTCCAGCC GCAGTCACAG 1200 

CATCTTCTCA ATCAGGATCC TACACCTTCA GGGGGAAG6A GATATAGTCX: CCAAGATCAG 1260 

OGAGCTGTCA CTCTGTQATC TGGCTGGCTC AGAGGGCIGC AAAGATCAGA AGAGTGGTGA 1320 

ACGGTTGAAG GAAGCAGGAA ACATTAACAC CTCTCTACSIC ACCCPGGOCC GCTGTATTGC 1380 

TGCCCTTCGT CAAAACCAGC AGAACCGGTC AAAGCAGAAC CTGGTTCCCT TCCGTGACAG 1440 

C3UV3TTGACT CGAGTGTTCC AAGGTTTCTT CACAGGCOGA GGCOGTTCCT GCATGATTGT 1500

CAATGTGAAT CCCTGTGCAT CTACCTATGA TGAAACTCTT CATGTGGCCA AGTTCTCAGC 1560 

CATTGCTAGC CAGGT6ACTT GTGCA1GCCC CACCTATGCA ACTGGGATTC CCATCCCTGC 1620 

ACTCX3TTCAT CAAGGAACAT AGTCTTCAGG TATCCCCCAG CTTAGAGAAA GGGGCTAAGG 1680 

CAGACACAGG CCTTGATGAT GATATTGAAA ATGAAGCTGA CATCTCCATG TATGGCAAAG 1740 

AGGAGCTCCT ACAAGTTGTG GAAGCCATGA AGACACTGCT TTTGAAGGAA CGACAGGAAA 1800 

AGCTACAGCT GGAGATGCAT CTCOGAGATG AAATTTGCAA TGAGATGGTA GAACAGATGC 1860 

AACAGCX3QGA ACAGTGGTGC AGTGAACATT TGGACACCCA AAAGGAACTA TTGGAGGAAA 1920 

TGTATGAAGA AAAACTAAAT ATCCTCAAGG AGTCACTGAC AAGTTTTTAC CAAGAAGAGA 1980 

TTCAGGAGCG GGATGAAAAG ATTGAAGAGC TAGAAGCTCT CTTGCAGGAA GCCAG ACAAC 2040 

AGTCAGTQGC CCATCAGCAA TCAGGGTCTG AATTGGCOCT ACGGOGGTCA CSiAAGGTTGG 2100 

CAGCTTCTGC CTCCACCCAG CAGCTTCAGG AGGTTAAAGC TAAATTACAG CAGTGCAAAG 2160 

CAGAGCTAAA CTCTACCACT GAAGAGTTGC ATAAGTATCA GAAAATGTTA GAACCACCAC 2220 

CCTCAGCCAA GCCCTTCACC ATTGATGTGG ACAAGAAGTT AGAAGAGGGC CAGAAGAATA 2280 

TAAGGCTGTT GCGGACAGAG CTTCAGAAAC TTGGTGAGTC TCTC CAATCa, GCAGAGAGAG 2340 

CTTGTTGOCA CACCACTGGO OCAGGAAAAC TTCGTCRAGC CTTGACCACT TCTGATGACA 2400 

TCTTAATCAA ACAGGACCAfi ACTCTGGCTG AACTGCAGAA CAACATGGTG CTAGTGAAAC 2460 

TGGACCTTOG GAAGAAGGCA GCATGTATTG CTGAGCAGTA TCATACTGTG TTGAAACTCC 2520 

AAGGCCAGGT TTCTGCCAAA AAGCGCCTTG GTACCAACCyV GGAAAATCAG CAACCAAACC 2580 

AACAACCACC AGGGAAGAAA CX:ATTCCTTC GAAATTTACT TCCCCGAACA CCAACCTGCC 2640 

AAAGCTCAAC AGACTGCAGC CCTTATGCCC GGATCCTACG CTCACGGOGT TCCCCTTTAC 2700 

TCAAATCTGG GCCTTTTGGC AAAAAGTACT AAGGCTGTGG GG AAAGAG AA GAGCA GTCAT 2760 

GGCCCTGAGG TGGGTCAGCT ACTCTCCTGA AGAAATAGGT CTCTTTTATG CTTTACCATA 2820 

TATCAGGAAT TATATCCAGG ATGCAATACT CAGACACTAG CTTTTTTCTC ACTTTTGTAT 2880 

TATAACCACC TATGTAATCT CATGTTGTTG rTETTTTTTA TTTACTT ATA TGATTTCTAT 2940 

GCACACAAAA ACAGTTATAT TAAAGATATT ATTGTTCACA TTTTTTATTG AATTCCAAAT 3000 
GTAGCAAAAT CATTAAAACA AATTATAAAA GGGACAGAAA AA 

Seq ID NO: 26 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

1 I I i 1 i 

MSQGXLSPPA GZtLSDDOWV SPMFESTAAD LGSWRKNLL SDCSWSTSL EDKQQVPSED 60 

SMEKVKVYLR VRPLLPSELE RQCDQGCVRI ENVETLVIiQA PKDSFALKSS EEGIGQATHR 120 

FTFSQIFGPE VGQASPFNLT VKEMVKDVLK GQNMI,rrrYG VTNSGKTHTI Q GTIKDG GIL 180 

PRSLALIFNS LQGQLHPTPD LKPIiLSNEVI WLOSXQIBQE EMKKLSLIiNG GU2EEBLSTS 240 

LKRSVYIESR IGTSTSPDSG lAGLSSISQC TSSSQLDETS HRHAQPDTAP LPVPANIRPS 300 
IWISFPBIYN ELLYDIiLEPP SQQRKRQTLR LCEDQS(2JPY VKDIjKWIHVQ DAEEAKKLLK 360
VOaKNQSFAS THLKQ27SSRS BSIP3IRII£ LQGB6DIVFR ISBLSLCDLA GSERCXDQKS 420 
GERLKEAGNI NTSLETLGRC lAAIiRjQNQQM RSKQNLVPFH. DSKLTRVFQG FFTGRGRSCM 480 
rVNVNPCAST YDETLHVAKF SAIASQVTCA CPTYATGIPI PALVHQGT 






Seq ID NO: 27 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 13-1424 






1 
I 

TAGAAGTTTA 
CTTCCOCTGA 
TTAGAAAAAT 
GGAAACTTAA 
GGGCAACTGG 
GTCCATCATT 
AGAATCAATA 
GCTTTCCAAG 
GCTGACATTT 
AAAGGTGGAA 
TTC6ATGAGG 
GTTCACGyVGA 
TTCCCCACCT 
GGCATTCAGT 
TCAGAAOCAG 
AAGATCTTTT 
AGTGTTAATT 
GAAATTGAAG 
AATTTAAGAC 
GTGAAAAAAA 
GATAACCAGT 
CTGATTACCA 
AACAAATACT 
CGTATCACCA 
TGGTTTTTGT 
GTGTACCACT 
TTATATAAAA 
CTCTACTATT 
CTCTGTAAGT 
TAAAATTAAG 



11 
I 

CAATGAAGTT 
ACAGCTCTAC 
TTTATGGCCT 
TGAAGGAAAA 
ACACATCTAC 
TCAGGGAAAT 
ATTACACACC 
TATGGAGTAA 
TGGTGGTTTT 
TCCTAGCCCA 
ACGAATTCTG 
TTGGCCATTC 
ACAAATATGT 
OOCTGTATGG 
CrCTCTGTGA 
TCTTCAAAGA 
TAATTTCTTC 
CCAGAAATCA 
CAGAGCCAAA 
TTGATGCAGC 
ATTGGAGGTA 
AGAACTTCCA 
ACTATTTCTT 
AAACACTGAA 
TAGTTCACTT 
ACTTAGAGAT 
TACATAATAT 
AA6TTTGAAA 
TGCTTCCTAA 
TATATATATT 



21 
I 

TCTTCTAATA 
AAGCCTGGAA 
TGAGATAAAC 
AATCCAAGAA 
CXrrCGAGATG 
GCXAGGGGGG 
TGACATGAAC 
TGTTACCCCC 
TGCCGGTGGA 
TGCTTTXGSA 
GACTACACAT 
CTTAGGTCTT 
TGACATCAAC 
AGAGCCAAAA 
COCCAATTTG 
CAGGTTCTTC 
CTTATGGCCA 
AGTTTTTCTT 
TTATCCCAAG 
TGTTTTTAAC 
TGATGAAAGG 
AGGAATOGGG 
CCAAGGATCT 
AAGCAATAGC 
CAGCTTAATA 
ATGTATCATA 
TTTTCAATTT 
ATAGTTACCT 
CATOCTTGGA 
TTGGCTCAAA 



31 
I 

CTGCTCCTGC 
AAAAATAATG 
AAACTTCCAG 
ATGCAGCACT 
ATGCACGCAC 
CCCGTATGGA 
OGTGAGGATG 
TTGAAATTCA 
GCTCATGGAG 
CCTGGATCTG 
TCAGGAGGCA 
GGCCATTCTA 
ACATTTCGCC 
6AGAACX:»AC 
ACTTTTGATO 
TG6CTGAAGG 
ACCTTGCCAT 
TTTAAAGATG 
AGCATACATT 
CCACGTTTTT 
AGACAGATGA 
CCTAAAATTG 
AACCAATTTG 
TGGTTTGGTT 
AGTATTTATT 
AAAATAAAAT 
TGAAAACTCT 
TCAAAGCAAG 
CTGAGAAATT 
TAAAATTG 



41 

1 

AGGCCACTGC 
TGCTATTTGG 
TGACAAAAAT 
TCTTGGGTCT 
CTCGATGTGG 
GGAAACATTA 
TTGACTACGC 
GC3UU3ATTAA 
ACTTOCATGC 
GCAITGGAGG 
CAAACTTGTT 
GTGATCCAAA 
TCTCTGCTGA 
GCTTQOCAAA 
CTGTCACTAC 
TTTCTGAGAG 
CTGGCATTGA 
ACAAATACTG 
CTTTTGGTTT 
ATAGGACCTA 
TGGACCXrTGG 
ATGCAGTCTT 
AATATGACTT 
GTTGAAAAT6 
GCATATTTGC 
CTGTAAACCA 
AATTGTCCAT 
ATAATTCTAT 
ATACTTACTT 



SI 
I 

TTCTGGACCT 
TGAAAGATAC 
GAAATATAGT 
GAAAGTGACC 
AGTCCCCGAT 
TATCACCTAC 
AATCC3GGAAA 
CACAGGCATG 
TTTTGATG6C 
GGATCCACAT 
CCTCACTGCT 
GGCOGTAATG 
TGACATAOGT 
TCCTGACAAT 
CGTGGGAAAT 
ACCAAAGACC 
AGCTGCTTAT 
GTTAATTAGC 
TCCTAACTTT 
CTTCTTTGTA 
TTATCCCAAA 
CTACTCTAAA 
GCTACTCXAA 
GTGTAATZAA 
TATGTCCTCA 
TAGGTAATGA 
TCTTGCTTGA 
TTGAAGCATG 
CTG6CATAAC 



60 
120 

180

240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900 
960
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 






Seq ID NO: 28 Protein sequence: 
Protein Accession #: Eos sequence



MKFLLILLLQ 
KEKIQEMQHF 
YTPDMNREDV 
LABAFGPGSG 
KZVDXNTFRL 
FKDHFFWLKV 
BPHYPKSIHS 
KFQGIGPKID 



11 
1 

ATASGAIiPUI 
LGUCVTGQLD 
DYAIRKAFQV 
IGGDAHPDED 
SADDIRGIQS 
SERPKTSVNL 
FGFPNFVKKI 
AVFYSKNKyY 



21 

1 

SSTSLEKNNV 
TSTLEMMHAP 
WSNVTPliKPS 
EFHTTHSGGT 
LYGDPKENQR 
ISSTMPTUPS 
DAAVFNPRFY 



31 

i 

LF6BRYLEKF 
RCGVPDVHHP 
KINTGMADIL 
NLPLTAVHBI 
LPHPDNSEPA 
GZEAAXEZEA 
RTYPFVDNQY 
YDFLLQRITK 



Seq ID NO: 29 DNA sequence

Nucleic Acid Accession #: NM_006115.1

Coding sequence: 236..1765



GCTTCAGGGT 
OGGGACACCC 
AGTCTCTGAG 
GAGACCTAGA 
ACGAAGGOGT 
CCCAOGGAGA 
TGCCX3CCCTG 
OGGGAGACAC 
TCXG6GAGTG 
TG6ACTTGAT 
GGATTTACGG 
TCTGTACTCA 
TGGTTTGAGC 
CCTCAAGGAA 
GAAAAATGTA 
TATCAAGATG 
TACCTG6AAG 
GOGTACACTC 
GCAGTATATC 
TGTGGACTCT 
CCXXmGGAA 
6TCCCAGAGT 
OGA3X3TAAGT 
OCTGGTCTTT 
GAGCCACTGC 



11 

1 

ACAGCTCXCC 
CACCOGCTTC 
GAAAAACCAT 
AATCCAAGGG 
TTGTGGGGTT 
CTPGTGGAGC 
GAGTTGCTGC 
AGCCAGACrC 
CTGAT6AAGG 
GTGCTCCTTG 
AAGAACTCTC 
TTTCCAGAGC 
ACAGAGGCAG 
GGTGCCTGTG 
CTAOGCCTGT 
ATCCTGAAAA 
CTAOCCACCT 
CTCCTCTCCC 
GCCCAGTTCA 
TTATTTTTCC 

ACCcrcrcAA 

CXXAGCGTCA 
OCOGAGCXXr 
GATGAGTGT6 
TCCCAGCTTA 



21 
I 

CGCAGCCAGA 
CCAGGCGTGA 
TTTGATTATT 
TTGGAGGTCC 
CCATTCAGAG 
TGGCAGGGCA 
CCAGGGAGCT 
TGAAGGCAAT 
GACAACATCT 
CCCAGGAGGT 
ATCAGGACTT 
CAGAAGCAGC 
AGCAGCCCTT 
ATGAATTGTT 
GCTGTAAGAA 
TGGTGCAGCT 
TGGOGAAATT 
ACATCCATGC 
CCTCTCAGTT 
TTAGAGGCCG 
TAACTAACTG 
GTCAGCTAAG 
TCCAAGCTCT 
GGATCAGGGA 
CAACCTTAAG 



31 
I 

AGCCGGGCCT 
CCTGTCAACA 
ACTCTCAGAC 
TGAGGCCAGC 
CCGATACATC 
GAGCCTGCTG 
CTTCCCGCCA 
GGTGCAGGCC 
TCACCTGGAG 
T0GCCCCA6G 
CTOGACrGTA 
TCA6CCCATG 
CATTCCAGTA 
CTCCTACCTC 
GCTGAAGATT 
GGACTCTATT 
TTCTCCTTAC 
ATCTTCCTAC 
CCPCAGTCTG 
CCTGGATCAG 
CCGGCTTTOG 
TGTCCTGAGT 
6CTGGAGAGA 
TGATCAGCTC 
CTTCTAOGGG 



41 
I 

YGLEXNKLPV 
REKPGGPVWR 
WPARGAHGD 
GHSLGLGHSS 
LCDPHLSFDA 
RNQVFLFKDO 
WRYDERRQMM 
TLKSNSWFGC 



41 
I 

GCAGCCCCTC 
GCAACTTGGC 
GTGOGTGGCA 
CTAAGTCGCT 
AGCATGAGTG 
AAGGATGAGG 
CTCTTCATGG 
TGGCCCTTCA 
ACCTTCAAAG 
AGGTGGAAAC 
TGGTCTGGAA 
ACAAAGAAGC 
GAGGTGCTCG 
ATTGAGAAAG 
TTTGCAATGC 
GAAGATTTGG 
CTGGGCCAGA 
ATTTCCCCXSG 
CAGTGCCTGC 
TTGCTCAGGC 
GAAGGGGATG 
CTAAGTGGGG 
GCCTCTGCCA 
CTTGCOCTCC 
AATTCCATCT 



51 
I 

TKMKYSGNLH 
KHYITYRXmi 
FHAFDGKGGI 
DPKAVMFPTY 
VTTVGNKIFF 
KYWLISNLRP 
DPGYPKLITK 



51 

I 

AGCACCGCTC 
GGTGTGGTGA 
ACAAGTGACT 
TCAAAATGGA 
TGTGGACAAG 
CCCTGGOCAT 
CAGCCTTTGA 
CCTGCCTCCC 
CTGTGCTTGA 
TTCAAGTGCT 
ACAGGGCCAG 
OAAAAGTAGA 
TAGACCTGTT 
TGAAGCGAAA 
CCATGCAGGA 
AAGTGACTTG 
TGATTAATCT 
AGAAGGAAGA 
AGGCTCTCTA 
ACGTGATGAA 
TGATGCATCT 
TCATGCTGAC 
CCCTCCAGGA 
TGCCTTCCCT 
GCATATCTGC 



60 

120 
180
240
300
360 
420 



60 
120 
180 
240 
300 
360
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 



CTTGCAGAGT 
TGTCOCCCTG 
TCTGCATGCC 
TAGTGCCAAC 
GTGCCCCTGT 
TTGGACACTA 
ACAAATGTTC 
GTTCAGTGAG 
GTGATCTTTG 
GATTCTGGCT 
TGTTGAAAAT 



CTOCTGCAGC 
GAGAGTTATG 
AGGCTCAGGG 
CCCTGTOCTC 
TTCATGOCTA 
AAGOCAGGAT 
AGTGTGAGTG 
GAAAAAAAGG 
GGGAGATACA 
TGOGAAGTAC 
AAAGAGAAGC 



ACCTCATOGG 
AGGACATCCA 
AGTTGCTCTG 
ACTGTQGGGA 
ACTAGCTGGG 
GTGCATGCAT 
AGGAAAACAT 
GGAAGTTOGG 
TCTTATAGAG 
ATGTAGGAGT 
AATGTGAAGC 



GCTGAGCAAT 
TGGTACCCTC 
TGAGTTGGGG 
CAGAACCrrC 
TGCACATATC 
CTTGAAGCAA 
GTTCAGTGAG 
GATA6GCAGA 
TTAGAAATAG 
TAATCCCTGT 
AAAAAAAAAA 



CTGAOCCAOG 
CACCTGGAGA 
CGGCOCAGCA 
TATCSICOCXSG 
AAATGCTTCA 
CAAAGCAGCC 
GAAAAAACAT 
TGTTGACTTG 
AATCT GAATT 
GTAGACTGTT 
AAAAAAAA 



TGCTGTATCC 
GGCTTGCCTA 
TGGTCTGGCT 
AGCCCATOCT 
TTCTGCATAC 
ACAGTTTCAG 
TCAGACAAAT 
AGGAGTTAAT 
TCTAAAGGGA 
GTAAAGAAAC 



Seq ID NO: 30 Protein sequence:
Protein Accession #: NP_006106.1



1 

1 

GCTTGAGGGT 
GGGGACACCC 
ACTCTCTGAG 
GAGACCTAGA 
ACGAAGGCGT 
CCCAOGGAGA 
TGCOGCCCTG 
OGGGAGACAC 
TCTGGGAGTG 
TGGACTTGAT 
GGATTTACGG 
TCTGTACTCA 
TGGTTTGAGC 
CCTCAAGGAA 
GAAAAATGTA 
TATCAAGATG 
TACCTGGAAG 
GCGTAGACTC 
GCAGTATATC 
TGTGGACTCT 
CCCCTTGGAA 
GTCCCAGAGT 
CGATGTAAGT 
CCT6GTCTTT 
GAGCCACTGC 
CTTGCAGAGT 
TGTCCCCCT6 
TCTGCATGOC 
TAGTGCCAAC 
GTGCCCCTGT 
TTGGACACTA 
ACAAATGTTC 
GTTCAGTGAG 
GTGATCTTTG 
GATTCTGGCT 
TGTTGAAAAT 



11 
I 

ACAGCTCOCC 
CACCOSCTTC 
GAAAAACCAT 
AATCCAAGCG 
TTGTGGGGTT 
CTTCyrGGAGC 
GAGTTGCTGC 
AGCCAGACCC 
GTQATGAAGG 
GTGCTCCTTG 
AAGAACTCTC 
TTTCCAGAGC 
ACAGAGGCAG 
GGTGCCTGTG 
CTACGCCTGT 
ATCCTGAAAA 
CTACCCACCT 
CTCCTCTCCC 
GCCCAGTTCA 
TTATTTTTCC 
ACCCTCTCAA 
CCCAGCGTCA 
CCCGA6CCCC 
GATGAGTGTG 
TCCCAGCTTA 
CTCCTGCAGC 
GAGAGTTATG 
AGGCTCAGGG 
CCCTGTCCTC 
TTCATGCCTA 
AAGCCAGGAT 
AGTGTGAGTG 
GAAAAAAAGG 
GGGAGATACA 
TGGGAAGTAC 
AAAGAGAAGC 



21 
I 

CGCAGCCAGA 
CCAGGCGTGA 
TTTGATTATT 
TTGGAGGTCC 
CCATTCAGAG 
TGGCAGGGCA 
CCAGGGAGCT 
TGAAGGCAAT 
GACAACATCT 
CCCAGGAGGT 
ATCAGGACTT 
CAGAAGCAGC 
AGCAGCCCTT 
ATGAATT6TT 
GCTGTAAGAA 
TGGTGCAGCT 
TGGCGAAATT 
ACATCCATGC 
CCTCTCAGTT 
TTAGAGGCCG 
TAACTAACTG 
GTCAGCTAAG 
TCCAAGCTCT 
GGATCACGGA 
CAACCTTAAG 
ACCTCATCGG 
AGGACATCCA 
AGTTGCT6TG 
ACTGTQGGGA 
ACTAGCTGGG 
GTGCATGCAT 
AGGAAAACAT 
GGAAGTTGGG 
TCTTATAGAG 
ATGTAGGAGT 
AATGTGAAGC 



31 
I 

AGCCGGGCCT 
CCTGTCAACA 
ACTCTCAGAC 
TGAGGCCAGC 
CC6ATACATC 
GAGCCTGCTG 
CTTCCCGCCA 
G6TGCA6GCC 
TCACCTGGAG 
TCGCCCCAGG 
CTGGACTGTA 
TCAGCCCATG 
CATTCCAGTA 

crrcTACCTc 

GCTGAAGATT 
GGACrCTATT 
TTCTOCTTAC 
ATCTTCCTAC 
CCTCAGTCTG 
CCTGGATCAG 
CCGGCTTTOG 
TGTGCTGAGT 
GCTGGAGAGA 
TGATCAGCTC 
CTTCTACGGG 
GCTGAGCAAT 
TGGTACCCTC 
TGAGTTGC3GG 
CAGAACCTTC 
TGCACATATC 
CTTGAAGCAA 
GTTCAGTGAG 
GATAGGCAGA 
TTAGAAATAG 
TAATCCCTGT 
AAAAAAAAAA 



41 
I 

GCAGOGCCTC 
GCAACTTCGC 
GTGOGTGGCA 
CTAAGTCGCT 
AGCATGAGTG 
AAGGATGAGG 
CTCTTCATGG 
TGGCCCTTCA 
ACCTTCAAA6 
AOGTGGAAAC 
TGGTCTGGAA 
ACAAAGAAGC 
GAGGTGCTCG 
ATTGAGAAAG 
TTTGCAATGC 
GAAGATTTGG 
CTGGGGCAGA 
ATTTCCCOGG 
CAGTGCCTGC 
TTGCTCAGGC 
GAAGGGGATG 
CTAAGTGGGG 
GCCTCTGCCA 
CTTGCCCTCC 
AATTCCATCT 
CTGACCCACG 
CACCTGGAGA 
CGGCCCAGCA 
TATGACCOGG 
AAATGCTTCA 
CAAAGCAGCC 
GAAAAAACAT 
TGTTGACTTG 
AATCTGAATT 
GTAGACTGTT 
AAAAAAAA 



51 
I 

AGCACCGCTC 
GGTGTGGTGA 
ACAAGTGACT 
TCAAAATGGA 
TGTGGACAAG 
CCCTGGCCAT 
CAGCCTTTGA 
CCTGCCTCCC 
CTGTGCTTGA 
TTCAAGTGCT 
ACAGGGCCAG 
GAAAAGTAGA 
TAGACCTGTT 
TGAAGCGAAA 
CCATGCAGGA 
AAGTGACTTG 
TGATTAATCT 
AGAAGGAAGA 
AGGCTCTCTA 
AOGTGATGAA 
TGATGCATCT 
TCATGCTSAC 
CCCTOCAGGA 
TGCCTTCCCT 
CCATATCTGC 
TGCTGTATCC 
GGCTTGCCTA 
TGGTCTGGCT 
AGCCCATCCT 
TTCTGCATAC 
ACAGTTTCAG 
TCAGACAAAT 
AGGAGTTAAT 
TCTAAAGGGA 
GTAAAGAAAC 



Seq ID NO: 31 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 64-2754 



1 
I 

GGCAGGTCTC 
CCGATGGCCG 
CTGACCCTGG 
CCTTCTAAAC 
TCTGCAGACC 
TACACAGCCA 
GACAAAAGGA 
TCGAAGACAA 
ATTCCTTGCT 
GAATCTGAT6 
AAAGAACCTT 
CCTGTGGATC 
GGATATTCAG 
CACCCTGTTT 
ACTACAGTGG 
CTGAAATACA 
AGCACAGGOG 
TCATTGATAA 
ACTTGTATCA 
TATGAAGCAT 
GATAAGGATT 
GAAAATGGAC 
GTAAAGCCAC 
GAAGCGCCAT 
GTTCATGTGA 
ATTAAAGAAA 



11 
I 

GCTCTCGGCA 
CCGCTGGGCC 
TGATCTTCAG 
TAGAGGCAGA 
TCATCCGGTC 
GGGCTGTTGC 
AACAGACACA 
GACACACTAG 
CTATGCAAGA 
CAGCACAGAA 
TAAATTTGTT 
GTGAAGAATA 
CAGATCTGCC 
TCACAQAAGC 
GGGTGGTTTG 
GCATTTTGCA 
TAATCACCAC 
TGAAAGTACA 
TAACAGTAAC 
TTGTAGAGGA 
TAATTAACAC 
ATTTCAAAAT 
TGAATTATGA 
TTGCTAGAGA 
GGGATCTG6A 
ACTTAGCAGT 



21 
I 

CCCTCCCGGC 
CCGGCGCTCC 
TCGTGATGGT 
CAAAATAATT 
AAGTGATCCT 
GCTGTCTGAT 
GAAAGAGGTT 
AGAAACTGTT 
GAATTCCTTG 
CTATACTGTC 
TTATATAGAA 
TGATGTTTTT 
CCTCCCACTA 
AATTTATAAT 
TGCCACAGAC 
GCAGACACCA 
AGTCTCTCAT 
AGACATGGAT 
AGATTCAAAT 
AAATGCATTC 
TGCCAATTGG 
CAGCACAGAC 
AGAAAACCGT 
TATTOCCAGA 
TGAGGGGCCT 
GGG6TCAAAG 



31 

1 

GCCCGCGTTC 
GTGCGCGGAG 
GAA6CCTGCA 
GGCAGAGTTA 
GATTTCAGAG 
AAGAAAAGAT 
ACTGTGCTGC 
CTCAGGCGTG 
GGCCCTTTCC 
TTCTACTCAA 
AGAGACACTG 
GATTTGATTG 
CCCATCAGGG 
TTTGAAGTTT 
AGAGATGAAC 
AGGTCACCTG 
TATTTGGACA 
GGCCAGTTTT 
GATAATGCAC 
AATGTGGAAA 
AGAGTCAATT 
AAAGAAACTA 
CAAGTGAACC 
GTGACAGCCT 
GAATGCACTC 
ATCAACQGCT 



1560 
1620 
1680 
1740 

1800

1860
1920 
1980 
2040 
2100 



60 
120 
180
240 
300 
360 
420
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
AATAGAAATG GCAATGGTTT AAGGTACAAA
ATTGftTGAAA TTTCAGGGTC AATCATAACT 
CCCAAAAATG AGTTGTATAA TATTACAJGTC 
ACTGGAACAC TTGCTGTGAA CATTGAAGAT 
GAATATGTAG TC3VTTTGCAA ACCAAAAATG
GATGAACCTG TCCA7GGAGC TCCATTTTAT 
AGTACAC7GT GGAGCCTCAC CAAA6TTAAT 
AATGCIGGAT TTCMGAATA TACCATTCCT 
GCAACAAAAT TATTGAGAGT TAATCTGTGT 

10 ACTTCAAGGA GTACAGGAGT AATACTTGGA 
ATAGCACTGC TCTTTTCTGT ATTGCTAACT 
GGQAAAGGTT TTCCTGAAGA TTTAGCACAG 
OC7GQAGA06 ATASAGTQTG CTCTGOCAAT 
AGCCRAGGTT TTTGTGGTAC TATGGG ATCA 

15 GAAATGATGA AAGGAGGAAA CCAGACCTTG 
ACOCTGGACT CCTGCAGGGG AGGACACAOG 
GAGTGGCACA GTTTTACTCA ACCCOGTCTC 
GAAGACCGCA TGCCATCCCA AGATTATGTC 
CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA 

20 AATAATTTGG AACCCAAATT TATTACATTA 
AGTGCTACAA TTAGGTCTTT GTCAGACATT 
TTCAATTTCA ACATGTATGT ATATGATGAT 
CCAATTTATA TTTTTAAAGC CAGTTGTTGC 
AACAGACAAC TGGTAAATCT CAAACTCCAG 

25 TCrrTTTTTT TTTTACX5GAT ATTTTAGTAA 
ATAGGTAAGT TAT6CTAATA TCACATTATT 
ATAAACAAGA AATATTGAGT ATCACTATGT 
ACTGAATTAA ATTAAAAATG TTGCAGCTCA 
ACCAAATTCA TTTGACTTTG GAGGCAAAAT 

30 TCTATAGGAA TATAGTTGGA AATAAATGTG 
ATTTAAAATG AAATGAGAAC AAAGAGGAAA 
TAJSTTTGTCC TACAATAGAA AAAAGAGAjGA 
ATTATAACTG AGTCTATGAG GAAATA GTTC 
GTAAATAAAT TAAACTTTTC TGGTTTCTGT 

35 TAGCTTTGCT TTGCAGTCTG TTTCAAGATT 
GAATACTCX3C TGCAGCTGGG GTTCCCTGCT 
TTTTTTTCGG GGAGCT AATA ACAAAAACAT 
CTCTATTGCT GTTTCTATTC TCTCTTATAG 
TAACCATGTC CTOCTAGAGT TTAGAGGCTA 

40 GCACCCTGGG GAGATTGATT GTCCTTAAAC 
GTCTGGGAGC TACAAAATTT CATTTTTCTC 
CTGAATCAAG GAAA6CCAGG CCTTGTGGGC 
ACCTCCAGCA GAGATTCCCT TAAGTGACTC 
AATTTTTAAT CAGTTTGCTT TCTCCAGAGA 

45 TTGAATGTAT AAAAGAAAAA GATCAAGTTG 
AAOCAGCCCA AGTAGGTTAT TTGTACAGTC 
GGGCAAGGAG AGGCCACAAG GAATATGGGT 

crrrrrccTA ggctpggcac tgccttttcc 
gtccggtgag ggatcagcca acctcttctc 

50 aaggagacag agctgactgc atgatgagtc 
gtt6t6ca6a acaaacaagg cattcatggg 
ctgggcacta agaaggtcta tgaattaaat 
ttttctgttt tctaatttga occtaaaatc 
occccccccc tttttttttg agacggagtc 

55 gctcxx3atct ctgctcactg aaagctcogc 
gcctcctgag tagctgggac tacaggcgcc 

TTTAATAGAG ACGGGGTTTC ACTGTGTTAG 
ATCCXSCCTGC CTCGGCCTCC CAAAGTGCTG 
CTTGTTTTCC GTTTAAAGTC GTCTTCTTTT 

60 TGATCATACG AAT7G0ATCA ATCTTGAAAT 
GGQAGAAAGA ACTCAGGGCA CAAAATATTG 
TTGCTGAAAT TTCCTGCTGT AACCAGAAGC 
ACTGTGTTTT GCTCACTCCC TCACTCACCG 
CTAGTGCCGA TAAACTTTCT CAAAGAGCAA 

65 TAACCATGTC TTTGTTCTTT GAACATGCTG 
TTTGTAATTC TTTTCTCTCA AATQAAAATT 
CATATGTAGT ATTATTATTT CCTPATATGT 
CAAGAAAATA TATTTTTAAA GCTTTCATTT 
TCTAAATATA CAGAATGTTT TTTCTTACTT 

70 GGGGTTTGTT TTGCAATGTT TTAAACAGAG 
TGCTTTTAAA GAAACTTGGC TGCTTAAAAT 
TACAGATGTG GGGAGATGTA ATAAAACAAT 
TA6AGATTAA ATAATTCTAA GATGATCACT 
AATAGAAATA CTCAATTATG TCTTTGTTGT 

75 ATTATCAAAT TGTCGACATC ATTAATATAT 
GAAGCACAGC TTTACAGATG AGTATCTATG 
ATTAAAAGTA TTAGAAGGTG GTT ATAATTG 
AGGGGTTTTA CTTTGAGGAC CAGTGTAGTC 

_ GGCAATATTG CAGTCTTGAT TCTGCCACTT 

OO CAAGATGATC CAACCATAAA GGTGCTCTGT 
AGTGTGCTCC CCTACAAACG TTAACACTGA 
AGCCTTACAT TTTAATATAG GTTGAACCAA 
CATTATTTTT GTGTATGTCT TCAAGAATGT 
ACCGGATACA TTTCACGTGT CCTTCAGTAT 

85 TGA6AAGCAT GGACACTAOA GCCAGAAT6C 
TTCTGTGTGA CCTTTGAAAG GCTACTTATT 
GAACAATGCC AGCCTCATG6 GGTTGTTGAA 



AAATTGCATG ATCCTAAAGG TTGGATCAOC 1620 

TCCAAAATOC TGGAXAGGGA GGTTGAAACT 1680 

CIGGCAAXAG ACAAAGATGA TAGATCATGT 1740 

GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GGGTATACC3G ACATmAGC TGTTGATCCT 1860 

TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920 

GATACAGCTG CCOGTCTTTC ATATCAGAAA 1980 

ATTACrGTAA AAGACAGGGC OGGCCAAGCT 2040 

GAATGTACTC ATCCAACTCA GTGTOGT606 2100 

AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

GGAATGAAAA AIGGAGG6CA GGAAACCATT 2400 

GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

GAGGTGGACA ACTGCAGATA CACTTACT06 2520 

GGTGAAAAAT TGCATCGATG TAATCAGAAT 2580 

CTCACTTATA ACTATGAGGG AAGAGGATCT 2640 

AAGCAGGAAG AAGATGGCCT TGACTTTTTA 2700 

GCAGAAGCAT GCACAAAGAG ATAATGTCAC 2760 

CTGQAGGTTT CCAAAAATAA TATTGTAAAG 2820 

TTTrrrCTCA ATTTTGAATT ATGCTACTCA 2880 

TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

CACTOGAATT AAGGTCTCTA AAGCATCTGC 3000 

TAAATATGCT GGATAAATAT TAGTCCAACA 3060 

ATGTATTCAC TTTAAGTGAT AGTTTAAAAA 3120 

GAAGAAAGTT TTGGAAAAGA AACAATGAAG 3180 

TAAAGAATTG G6ACTCACCC CTACTGCACT 3240 

GTGTTGAAGT GCCCTATGAA GTAGCAATTT 3300 

TGTGTGTATA TTATTATTAA TCAATGCAAT 3360 

ATGGTAAAAA CTTGAAATGA GGCTGGGGTA 3420 

GCTTCCTAGG OCTGGGCTCT TAAATGCTGC 3480 

CTGTCCAATT TGTGTAATTT GTTTAAAATT 3540 

GGGAAGGAAA TAGGGAATCC AATQGAACAG 3600 

TCTGCATCCA CAAGTTAGTA GCAAACTGGG 3660 

TTTTGGTAGC AAGGGTCCAG AGATGAGGTG 3720 

TTTAAAACTT ACCTTTACTG AAGTTAAATC 3780 

TGACCAACAT CTTTTTAATT TAGATCGAAA 3840 

GAGGGAGCIG A6GGGAGGAT CTTACTGAAA 3900 

CTAAGCCCCA CAAACTTGAC ACCTGATCAG 3960 

CTCACTGCCC TTCTTCTGAG TGGCATTCGC 4020 

CCCCTTCTTT CGGCTTTCTG CTAAAGCAAC 4080 

CAGGTTTTCC ACCATCCTTC AGOGTGAATT 4140 

AATTTTAAAA TAATAGAAGA AATAGAAATT 4200 

TCATTTTAGA ACAGAGGGAA CTTTGGGAGA 4260 

AGAGGGCAAC AG6AAGATGC AGGCCTTCAA 4320 

GGGAGTAAAA GCAACATCGT CTGCTTCATA 4380 

TTTCTCAGGC CAATG6CAAC rGCCATTTGA 4440 

TATGGCTCAC CTTATTTGGA GTGAGAAATC 4500 

TGAAGGCATT TGCAGGATGA GCCTGAACTG 4560 

AATTGTTGTA TTCCTTCTGC AGCOCTCCTT 4620 

GCCTATCTAA AATTCTGATT TATTCCTACA 4680 

TATGTGTTTT AGACTTAGAC TTTTTATTGC 4740 

TCGCTCTGAC GCACAGGCTG GAGTGCAGTG 4800 

CTCCCGGGTT CATGCCATTC TCCTGCCTCA 4860 

CACCACCACG CCOGGCTAAT TTTTTGTATT 4920 

CCAGGATGGT CTOGATCTCC TGACCTC3GTG 4980 

GGATTACAGG CATGACCCAC 0GCTCCCX3GC 5040 

AATGTAATCA TTTTGAACAT GTGTGAAAGT 5100 

ACTCAACCAA AAGACAGTCG AGAAGCCAGG 5160 

GTCTCAGAAT GGAATTCTCT GTAAGCCTAG 5220 

CAGTTTTATC TAACGGCTAC TGAAACACCC 5280 

ATCAAAACCT GCTACCTCCC CAAGACTTTA 5340 

CCAGTATCAC TTCCCTGTTT ATAAAACCTC 5400 

AAAACCACCT GGTCTGCATG TATGCCCGAA 5460 

TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

GTAAGGTGAA ATTTATGGTA TTTGAfiTGTG 5580 

TTCCCCCAGT GAATGATTTA GAATTTTTTA 5640 

TTATAAGGAA GCAGCTGTCT AAAATGCAGT 5700 

TTTTAGTATT GCTATTAAAA GAAGTTACTT 5760 

AAGCAAAAAT TGGATGCATA AAGTAATATT 5820 

ATTAACTTGG y rr CT avm ' TTGCTGTATT 5880 

TTGCAAAATT AT6CTTATG6 CTGGCATGGA 5940 

ATTAATGGGG AATATTTTGG ACAATGTTTC 6000 

ATTGTAATGT TGGGAAGAGA TCACTATTTT 6060 

ATACATATGT ATAATAAATT TTGATOGGGT 6120 

CAGAGTATTC CATGAATAGT ACACTGACAC 6180 

AAGGGAAAAC ATGAGTTAAA AAGAAAAGCA 6240 

ACAGGATAGA TAATGCCTGA ACTTTAATGA 6300 

GCTTCACAGT GAATCTTTTC CCCATGCAGQ 6360 

TCATTTCAAA AATCTATTAG CTATATCAAA 6420 

AATTTCAATT CXaGTAACTT CTATTGTAAC 6480 

TCATTGGATT TTTGTTTGTA ATAGTAAAAT 6540 

TGATTTGGTT GAATATTGGG TCATAATGGT 6600 

TTGGATATGA ATCCTGGATC TGTCACTTAC 6660 

TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

TGATTAAATT AGTTAATATA CCTAAAGTAC 6780 
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ATAGAACACT GCCTG C ACAT AGTAAAAGAA TTATAAGTGT GAGGTAGTTG GTAAAATTAT 6840 

GTAGTTCGAT ATACTACOSA ACAATATCTA ATCTCTTTTT AGGGAAATAA AGTTTGTGCA 6900 
TATATATAAT CCC3GAAACAT G 

Seq ID NO: 32 Protein sequence:
Protein Accession #: NP_001932.1

1 11 21 31 41 51
| | | | | |

MAAAGPHRSV RGAVCLHLLI. TLVIFSRDGB ACKKVILNVP SKLEADKIIG RVNLEBCFRS 60 

ADLIRSSDPD FHVIATXSSVY TARAVALSDK KRSFTIHLSD KRKQTQKEVT VLLEHQKKVS 120 

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFWK3VE SDAAQNYTVF YSISGRGVDK 180 

EPUILFYIER DTGHIiPCTRP VDHEEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240 

PVFTEAIYHP BVLESSRPGT TVGWCATDR DEPDTMHTRL KYSILQQTPR SPGIiFSVHPS 300 

TGVITTVSHY LDREWDKYS LIMKVQDMDG QFPGLIGTST CIITVTDSND MAPTFRQNAY 360 

EAFVEENAFN VEILRIPIED RDLINTAKHR VKFTILRCaiB KGHFRXSTDK ETNEGVLSW 420 

KPLNYEENHO VNLBIGVKNE APFARDIPRV TALMRALVTV HVBDLDEGPS CTPAAQWRI 480 

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KIUIREVETP 540

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPBILQS YWICKPKMG YTDIIAVDPD 600 

EPVHGAPFYP SliPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660 

TKhURVmtCS CTHPTQCRAT SRSTSVILGR HAZLAXLLGI ALLFSVLbTL V06VFGATKG 720 

KRFPEDLAQQ ITLIISNTEAP GDORVCSANO FKCQTISSSS Q6F0GTMGSG MKKGGQETIE 780 

KMZGGMQTLS SCRGAGHHUT U^SCBGGHTE V23NCRyTYSB WBSFXOPSLG EKLEROiattB 840 
08MPSQDYVL TYNYBSRaSP AGSVGCCSBK QEEDGIjDFX2I NLEPKFITIiA EACTKR 

Seq ID NO: 33 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 64-2583

1 11 21 31 41 51

| | | | | |

GGCAG6TCTC GCTCTOSGCA CCCTCCOGGC GCC(XCGTTC TCCTGGCCCT GCCOSGCATC 60 

CCGATGGC03 CXXSCIOGGCC COGGOGCTCC GTGCGOGGAG CCGTCTGCCT 6CATCTGGTG 120 

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTGCAGACC TCATCOGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 

TACACAGCCA GGGCIGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420 

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCX3TG CCAAGAGGAG ATGGGCACCT 480 

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGrTTCT TCAACAAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TG6AGTT6AT 600 

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTOGG 660 

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720 

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAC TAGACCTGGT 840 

ACTACAGIGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGA CACAA T GCATAGGGGC 900 

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 

AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGCC3VATTGG AGAGTCAATT TTACCATTTT A AAGG GAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCrGTT 1320 

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAAOC TGGAAATTGO AGTA AACAAT 1380 

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560 

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620 

ATTGATGAAA TTTC3U3GGTC AATCATAACT TCCAAAATCC TGGATAGGGA 6GTTGAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACXX3 ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CJCAATACTTC TCCAGAAATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AAT6CTGGAT TTCAA6AATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGT OGTG OG 2100 

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GA6TTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACXn'TG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

ACCCTG6ACT CCTGCAGGGG AGGACACAOG 6AGGTGGACA ACTGCAGATA CACTTACTCG 2520 

GAGTGGCACA GTTTTACTCA ACCCOGTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580 

TAAAAATTAA ACATAAAAGA AATTGCATCG ATGTAATCAC AATGAAGACC GCATGCCATC 2640 

CCAAGATTAT GTCCTCACTT ATAACTATGA GGGAAGAGGA TCTCCAGCTG GTTCTGTGGG 2700 

CTGCTGCAGT GAAAAGCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACCCAA 2760 

ATTTATTACA TTAGCAGAAG CATGCACAAA GAGATAATGT CACAGTGCTA CAATTAGGTC 2820 

rrrGTCAGAC ATTCPGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACATGTA 2880 

TGTATATGAT GATTTTTTrC TCAATTTTGA ATTATGCTAC TCACCAATTT ATATTTTTAA 2940 

AGCCAGTTGT TGCTTATCTT TTCCAAAAAG TGAAAAATGT TAAAACAGAC AACTGGTAAA 3000 

TCTCAAACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TGCTCTTTTT TTTTrTTACG 3060 

GATATTTTAG TAATAAATAX GCTGGATAAA TATTAGTCCA ACAATAGCTA AGTTATGCTA 3120 

ATATCACATT ATTATGTATT CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTG 3180 

AfiTATCACTA TGTGAAGAAA GTTTTGGAAA AGAAACAATG AAGACT6AAT TAA ATTA AAA 3240 

ATGTTGCAGC TCATAAAGAA 7TGGGACTCA COCCTACTGC ACTACCAAAT TCATTTGACT 3300 
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TTGGAGGCAA AATGTGTTGA AGTGCXCTM GAAGTAGCAA TTTTCTATAG GAATATAiGTT 3360 

GGAAATAAAT CTAVIT^'iGTOT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAG 3420 

AACAAAGA06 AAAATGGTAA AAACTTGAAA TGAOGCIGGG GTATAGTTTG TCCTACAATA 3480 

GAAAAAAGAC AGAGCTTCCT AGGOCTGGGC TCTTAAASGC TGCATTATAA CTGAGTCTAT 3540 

GACGAAATAG TTCCTGTOCA ATTTGTGTAA TTTGTTTAAA ATTGTA AATA A ATTAAA CTT 3600 

TTCrOGTTTC TGTCGGAA£3G AAATAGGC3AA TCCAATGGAA CAGTAGCTTT 6CTTTGCA6T 3660

CTGTTTCAAG ATTTCTGCAT CCACAACTTA GTAGCAAACT GGGGAATACT OGCTGCAGCT 3720 

GGGGTTCCCT GCTTTrPGGT AGCAAGGGTC CAGAGATGAG GTGTTTTTTT CGGGGAGCTA 3780 

ATAACAAAAA CATTTTAAAA CTEACCTTTA CTGAAGTTAA ATCCTCTATT GCTGTTTCTA 3840 

TTCTCTCTTA TAGTGACCAA CaTCTTTTTA ATTTAGATOC AAATAACCAT GTOCTCCTAG 3900 

AGTTTAGAGG CTAGAGGQAG CTGAGGGGAfi GATCTTACTG AAAGCACCC7 GGGGAGATTG 3960 

ATTCTCCTTA AACCTAAGOC CCACAAACTT GACACXrTGAT CAGGTCTOGG AGCTACAAAA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GAGTGGCATT GGCCTGAATC AAGGAAAGCC 4080 

AGGCCTTGTG GGCCXXCTTC TTTCX5GCTTT CTGCTAAAGC AACACCTCCA GCAGA GATT C 4140 

CCTTAAGTGA CTCCAGGTTT TCCACCATCC TTCAGOCSTGA ATTAATTTTT AATCAGTTTG 4200 

CTTTCTCCAG AGAAATTTTA AAATAATAGA AGA AATA GftA ATTTTGAATG TATAAAACAA 4260 

AAAGATCAAG TTCTCATTTT AGAACAGAGG GAACTTTGGG AGAAAGCAGC CCAAGTAG6T 4320 

TATTTGTACA GTCRGAGGGC AACAGGAAGA TGCAGGCCTT CAAGGGCAAG GAGAG GCXaC 4380 

AAGGAATATG GGTGGGAGTA AAAGCAACAT CGTCTGCITC ATACTTTTTC CTAGGCTTGG 4440 

CACTGCCTTT TCCTTTCTCA GGCCAATGGC AACTGCCATT TGAGTCCGGT GAGGGATCAG 4500

GCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCTGAC 4560 

T6CATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTGGTTGTGC AGAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAA GAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG AITTATTCCT ACATTTTCIG TT TTCTAATT 4740 

TCACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCOCCCC CXXOTTTTTT 4800 

TTCAGAOGGA GTCTOGCTCT GACGCACAGG CPGGAGTGCA GTGGCTCCGA TCTCTGCTCA 4860 

CTGAAAGCTC CGCCTCCOGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTACAGGC GCCCACCACC AOGCCCGGCT AATTTITTCT ATTTTTAATA GAGACGGGGT 4980 

TTCACTGTGT TAGCCAGGAT GGTCTOGATC TCCTGACCTC GTG ATCCGCC T GOCTOGG CC 5040 

TCCCAAAGTG CTGGGATTAC AGGCATGACC CACOGCTCCC GGCCTTGTTT TCCGTTTAAA 5100 

GTCOTCTTCr TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT AOGAATTQGA 5160 

TCAATCTTGA AATACTCAAC CAAAAGACAG TCGAGAAGCC AGGGGGAGAA AGAACTCAGG 5220

GCACAAAATA TTGGTCTGAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCO^CTGTGT TTTGCTCACT 5340 

CCCACTCACC GATCAAAAOC TGCTACCTCC CCAAGACTTT ACTAGTGCOG ATAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT CTAACCATCT CTTTG rrCri 5460 

TGAACATGCT GAAAACCACC TGGTCTGCAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TA TTATTA TT 5580 

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT ATGTAAATAT ACAGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACAGA GTTTTAGTAT T6CTATTAAA AGAAGTTACT TT6CTTTTAA AGAAACTTGG 5820 

CIGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATGT GGGGAGAT6T 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAAG TAATA 5940 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000 

TTTAGAGATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG GGAATATTTT GGACAATGTT 6120 

TCATTATCAA ATTGTCGACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TTGAAGCACA GCTTTACAGA TCAGTATCTA TGATACATAT GTATAATAAA TTTTGATCGG 6240 

GTATTAAAAG TATTAGAAOG TGGTTATAAT TGCAGAGTAT TCCA7GAATA GTACACTGAC 6300 

ACAGGGGTTT TACTTTGAGG ACCAGTGTAG TCAAGGGAAA ACAT6AGTTA AAA AGAA AAG 6360 

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTCACA GTGAATCTTT TCCCCATGCA 6480 

GGAGTGTGCT COCXn-ACAAA CS3TTAAGACT GATCATTTCA AAAATCTATT AGCTATATCA 6540 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCRGTAA C TTCTATTGTA 6600 

ACCATTATTT TTGTGTATGT CTTCRAGAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660 

ATACCGGATA CATTTCACGT GTCCTTCAGT ATTGATTTGG TTGAATATTG GGTCATAATG 6720 

GTTGAGAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT GAATCCTGGA TCTGTCACTT 6780 

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAG6TAGT TGG TAAAA TT 6960 

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTPGTG 7020 
CATATATATA ATCCGGAAAC ATG 



Seq ID NO: 34 Protein sequence:
Protein Accession #: NP_077741.1

1 11 21 31 41 51
| | | | | |

MAAAGPRRSV RGAVCLHLLL TLVIPSRDGE ACKKVILNVP SKLEADXIIG RVNLEECFRS 60 

ADLIRSSDPD FRVUIDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120 

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PPPLPLQQVE SDAAQNTrVF YSISGRGVDK 180 

EPLNLPYIER DTGNLPCTRP VDREEYDVFD LIAYASTADG VSADLPLPLP IRVEDENDNH 240 

PVPTEAIYNP EVLESSRPGT TVGWCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300 

TGVITTVSHy LDREVnmKYS LIMKVQDKDG QFFGLIGTST CIITVTDSND NAPTPRQNAY 360 

EAFVEEHAFN VEILRXPIED KDLINTAMHR VMFTILIQCaiE N^FKISTOK ETMBGVLSW 420 

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TAiNRALVTV HVRDLDB6PB CTPAAQWRI 480 

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DBISGSIITS KILDREVETP 540 

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YWICKPKKG YTDILAVDPD 600 

BFVHGAPPYP SIiPNTSPBIS RLHSLTKVim TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660 

TKLLRVNU:E CTHPTQCRAT SRSTGVILGK HAIIAXLL6I ALLPSVLIiTI. VOGVFGATKG 720 

KRFPEDLAQQ MLXISNTEAP GODRVCSANG F»1TQTTNNSS QGFOGTMGSG MKNGGQETIE 780 
iWKGGSTTLB SCRGAGHKHT LOSCRGGBTS vmCBYTTSB KH5FTQPRZ/3 EESIRGHTG 



Seq ID NO: 35 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 146-1273

1 11 21 31 41 51

| | | | | |

GGGAGTGGGC GTGGOCSGTGC TGCOCAGGTG AGCOVCOQCT GCTTCTGCCC AGA CAOGGTC 60 

GCCTCCACAT CCAGGTCTTT GTGCTCCTCG CTTGCCTGTT CCTTTTCCAC GCATTTTCCA 120 

GGATAACTGT GACTCCAGGC TGCCCTGCAA CTAGCAAATT CGGClVnGC 180 

CGTTGATCTG TTCAAACAAC TATGTGAAAA GGAGGCACTG GGCAATGTCC TCTTCTCTCC 240 

AATCXGTCTC TCCaCCTCTC TGTCACTtGC TCMGTGGGT GCTA AAGGTG A CACTGC AAA 300 

TGAAATTGGA CAGGTTCTTC ATTTTGAAAA TGTCAAAGAT ATACOCTTTG GATTTCAAAC 360 

AGTAACATCG GATGTAAACA AACTTAGTTC CTTTTACTCA CTGAAACTAA TCAAGOGGCT 420 

CTAOGTAGAC AAATCTCTGA ATCTTTCTAC AGAGTTCATC AGCTCTAOGA AGAGACX:CTA 480 

TGCAAAGGAA TTGGAAACTG TTGACTTCAA AGATAAATTG GAAGAAACGA AAGGTCAGAT 540 

CAACAACTCA ATTAAGGATC TCACAGATGG OCACTTTGAG AACATTTTAG CTGACAACAG 600 

TGTGAAGGAC CAGACCftAAA TCCTTGTGGT TAATGCTGCC TACTTTGTTG GCAACTGGAT 660 

GAA(3AAATTT CCTGAATCAG AAACAAAABA ATGTCCTTTC AGACTCAACA AGACAGACAC 720 

CAAACCAGTG CAGATGATGA ACATGGAGGC CaCXjTTCTGT ATGGGAAACA ITGACAGTAT 780 

CAATTGTAAG ATCATAGAGC TTCCTTTTCA AAATAAGCAT CTCAGCATGT TCATOCTACT 840 

ACCCAAGGAT GTGGAGGATG AGTCCACAGG CTTGGAGAAG ATTGAAAAAC AACTCAACTC 900 

AGAGTCACTG TCACAGTGGA CTAATCCCAG CACCATGGCC AATGCCAAGG TCAAACTCTC 960 

CATTCCAAAA TTTAAGGTGG AAAAGATGAT TGATCCCAAG GCTTGTCTGG AAAATCTAGG 1020 

GCT6AAACA7 ATCTTCAGTG AAGACACATC TGATTTCTCT GGAATGTCAG AGAGCAAGOG 1080 

AGTGGCCCTA TCAAATGTTA TCCACAAAGT GTGCTTAGAA ATAACTGAAG ATGGTGGGGA 1140 

TTCCATAGAG GTGCCAGGAG CACGGATCCT GCAGCACAAG GATGAATTGA ATGCTGACCA 1200 

TCCCTTTATT TACATCATCA GGCACAACAA AACTOGAAAC ATCATTTTCT TTGGCAAATT 1260 

CTGTTCTCXrr TAAGTGGCAT AGCCCATGTT AA6TCCTCCC TGACTTTTCT GTGGATGOOG 1320 

ATPTCTOTAA ACTCTGCATC CAGAGATTCA TTTTCTAGAT ACAATAAATT GCT AATGTTG 1380 

CTGGATCAGG AAGCOGCCAG TACTTGTCAT ATGTAGCCTT CACACAGATA GACCTTTTTT 1440 

TTTTTCCAAT TCTATCTTTT GTTTCCTTTT TTCCCATAAG ACAATGACAT ACGCTTTTAA 1500 

TGAAAAGGAA TCACGTTAGA GGAAAAATAT TTATTCATTA TTTGTCAAAT TGTCG6GGGT 1560 

AGTTGGCAGA AATACAGTCT TCCACAAAGA AAATTCCTAT AAGGAAGATT TGGAAGCTCT 1620 

TCTTCCCAGC ACTATGCTTT CXmCTTUGG GATAGAGAAT GTTOCaGACA TTCTCGCTTC 1680 

CCTGAAAGAC TGAAGAAAGT GTAGTGCATG GGACCCAOGA AACTGCCCTG GCTCCAGTGA 1740 

AACTTGGGCA CATGCTCAGG CTACTATAGG TCCAGAAGTC CTTATGTTAA GCCCTGGCAG 1800 

GCAGGTGTTT ATTAAAATTC TGAATTTTGG GGATTTTCAA AAGATAATAT TTTACATACA 1860 

CTGTATGTTA TAGAACTTCA TGGATCAGAT CTGGGGCAGC AACXTTATAAA TCAACAOCTT 1920 

AATATGCTGC AACAAAATGT AGAATATTCA GACAAAATGG ATACATAAAG ACTAAGTAGC 1980 

CCATAAGGGG TCAAAATTTG CTGCCAAATG CGTATGCCAC CAACTTACAA AAACACTTCG 2040 

TTCGCAGAGC TTTTCAGATT GTGGAATGTT GGATAAGGAA TTATAGACCT C TAGTA GCTG 2100 

AAATGCAAGA CCCCAAGAGG AAGTTCAGAT CTTAATATAA ATTCACTTTC ATTTTTGATA 2160 

GCTGTCCCAT CTGGTCATGT GGTTGGCACT AGACTG6TGG CAGGGGCTTC TAGCTGACTC 2220 

GCACAGGGAT TCTCACAATA GCCGATATCA GAATTTGTGT TGAAG6AACT TGTCTCTTCA 2280 

TCTAATATGA TAGCGGGAAA AGGAGAGGAA ACTACTGCCT TTAGAAAATA TAAGTAAAGT 2340 

GATTAAAGTG CTCACGTTAC CTTGACACAT AGTTTTTCAG TCTATGGGTT TAGTTACTTT 2400 

AGATGGCAAG CATGTAACTT ATATTAATAG TAATTTGTAA AGTTGGGTGG ATAAG CTATC 2460 

CCTGTTGCCG GTTCATGGAT TACTTCTCTA TAAAAAATAT ATATTTACCA AAAAATTTTG 2520

TGACATTCCT TCTCCCATCT CTTCCTTGAC ATGCATTGTA AATAGGTTCT TCTTGTTCTG 2580 
AGATTCAATA TTOAATTTCT CCTATGCTAT TGACAATAAA ATATTATTGA ACTAOC 

Seq ID NO: 36 Protein sequence: 
Protein Accession #: NP_002630.1 

1 11 21 31 41 51 

| | | | | |

KDALQIiANSA FAVDLPKQLC EKEPLGNVLP SPICLSTSLS LAQVQAKGDT ANEIGQVLHF 60 

ENVKDIPPGP QTVTSDVNKL SSFYSUCLIK RLYVDKSLNL STBFISSTKR PYAKELETVD 120 

FKDKLEETKG QINNSIKDLT DGHFENILAD NSVNDQTKIIi WNAAYFVGK WMKKPPESET 180 

KECPFRIiNKT DTKPVQMMNM EATFCMGNID SINCKIIELP PQNKHLSMFI IiLPKDVEDES 240 

TGLEKIEKQL NSESLSQWTN PSTMANAKVK LSIPKFKVEK MIDPKACLEN U3LKHIPSED 300 

TSDFSGMSET KGVALSNVIH KVCLEITEDG GDSIEVFGAR ILQHKDEUJA DHPFIYIIRH 360 
NKTRHIIFFG KFCSP 

Seq ID NO: 37 DNA sequence 

Nucleic Acid Accession #: NM_0168583 

Coding sequence: 72-842 

1 11 21 31 41 51 

| | | | | |

GGAGTGGGGG AGAGAGAGGA GACCAGGACA GCTGCTGAGA CCTCTAAGAA GTCCAGATAC 60 

TAAGAGCAAA GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 120 

CCATQGCCCA GTTTCGAGGC OVCCCCGTGC OCCTGGACCA GACCCTGCCC TTGAATGTGA 180 

ATCCAGOCXrr GCCCTTGAGT CCCACAGGTC TTGCAGQAAG CTTGACAAAT GCCCTCAGCA 240 

ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA TPCTGGAAAA CCTTOOGCTC CTGGACATCC 300 

TGAAGCCTGG AGGAGGTACT TCTGGTGGOC TCCTTG6GGG ACTGCTTGGA AAAGTGAOGT 360 

CAGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCCC CAGCTGCTGG 420 

AACTTGGCCT TGTGCAGAGC CCTGATGGCC ACCGTCTCTA TGTC31CCATC CCTCTCGGCA 480 

TAAAGCTCCA AGTGAATACG CCCCTGGTCX; GTGCAAGTCT GTTGAGGCTG GCTGTGAAGC 540 

TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAGAGGATC CACCTGGTCC 600 

TTGGTGACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 660 

CCCTCCCCAT TCAA6GTCTT CTGGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCCTG 720 

AGTTGGTTCA GGGCAACGTG TGCCCTCTGG TCAATGAGGT TCTCA6AGGC TTGGACATCA 780 

CCCTGGTGCA TGACATTGTT AACATGCTGA TCCACGGACT ACAGTTTGTC ATCAAGGTCT 840 

AAGCCTTCCA GGAAGGGGCT GGCCTCTGCT GAGCTGCTTC CCAGTGCTCA CAGATGGCTG 900 

GCCCATGTGC TGGAAGATGA CACAGTTGCC TTCTCTCCGA GGAACCTGCC CCCTCTCCTT 960 

TGCCACCAGG 0GTGT6TAAC ATCCCATGTG CCTCACCTAA TAAAATGGCT CTTCTTCTGC 1020 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA

Seq ID NO: 38 Protein sequence:
Protein Accession #: NP_057667

1 11 21 31 41 51

| | | | | |

MFQTXSGLrVF YGLLAQTMAQ FGGLPVPLDQ TLPLHVNPAL PLSPTGLRGS LTNALSNGLL 60
SGGLLGILEN LPUSIbKK GGTSGGUiCC LliGKVTSWIP OKKIIOIKV TDPQU/EiGIi 120
vQSP«3ati.y vTtnaiXLQ vtmu/a^su lkuwkldit ABiiAviiDXQ stamm/sx 180
THSPGSIjQIS UiDGbOFIiPI QGLLDSLTGI LMKVliPELVQ GNVCPLVJJBV LRGLDITLVH 240
DIVDMblHGti QTVIKV






Seq ID NO: 39 DNA sequence

Nucleic Acid Accession #: NM_004363.1

Coding sequence: 115-2223 



1 11 21 31 41 51
| | | | | |

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAACGT 60
TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC AGACAGCAGA GACCATGGAG 120
TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCIGGC AGAOGCTGCT GCTCACAGCC 180
TCACTTCTAA CCTTCTGGAA COOSCCCACC ACTGCCAAGC TCACTATTGA ATCCACGCOG 240
TTCAATGTCG CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300
TTTGGCTACA GCTGGTACAA AGGTGAAAGA GTGGATGGCA ACOGTCAAAT TATAGGATAT 360
GTAATAGGAA CTCAACAAGC TACCCCAGGG CCCX5CATACA GTGGTOGAGA GATAATATAC 420
GCCMOGCAT GCCTGCTGAT GCAGAACATC ATCCAGAATG ACACAGGATT CTACACCCTA 480
CAOGTCATAA AGTCAGATCr IGTGAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCCX3 540
GAGCTGCCCA AGCCCTCCAT CTCCAGCAAC AACTCCAAAC CGGTGGAGGA CAAGGATGCT 600
GTGGCCTTCA CCTGTGAACC TGAGACTCAG GACGCAACXn* ACCTGTGGTG GGTAAACAAT 660
CAOAGCCrCC OGGTCAGTCC CAGGCTGCAG CTGTCCAATG GCAACAGGAC CCTCACTCTA 720
TTCAATGTCA CAAGAAATGA CACAGCAAGC TACAAATGTG AAACCCAGAA CCCAGTGAGT 780
GCCAGGCGCA GTGATTCAGT CATCCTGAAT GTCCTCTATG GCCCGGATGC CCCCACC3VTT 840
TCCCCTCTAA ACaCATCTTA CAGATCAGGG GAAAATCTQA ACCTCTCCTG CCACGCAGCC 900
TCTAACCCAC CTGCACAGTA CTCTTQGTTT GTOVATGGGA CTTTCCACCA ATCCACCCAA 960
GAGCTCTTTA            CACTGTGAAT AATAGTGGAT CCTATAOGTG CCAAGCCCAT 1020
AACTCAGACA CTGGCCTCAA TAGGACCACA GTCAOGACGA TCACAGTCTA TGCAGAGCCyV 1080
CCCAAACCCT TCATCACCAG CAACAACTCC AACCCOGTGG AGGATGAGGA TGCTGTAGCC 1140
TTAACCTGTG AACCTGAGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200
CTCCOGGTCA GTCCCAGGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAGT 1260
GTCACAAGGA ATGATGTAGG ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320
CACAGCGACC CAGTCATCCT GAATGTCCTC TATGGCCCAG ACGACCCCAC CATTTCCCCC 1380
TCATACACXrr ATTACCGTCC AGGGGTGAAC CTCAGCCTCT CCTGCXATGC AGCCTCTAAC 1440
CCACCTGCAC AGTATTCTTG GCTGATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500
TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CXTGCCAGGC CAATAACTCA 1560
GCCAGTGG0C ACAGCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620
CCCTCCATCT GCAGCAACAA CTCCAAACCC GTGGAGGACA AGGATGCTGT GGCCTTCACC 1680
TGTGAACCTG AGGCTGAGAA CACAACCTAC CTGTGGTGGG TAAATGGTCA GAGGCTCCCA 1740
GTCAGTCCCA GGCTGC3VGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800
AGAAATGACG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860
GACCCAGTCA CCCTGGATGT CCTCTATGGG COGGACACCC CCATCATTTC CCCCCCAGAC 1920
TOGTCTTACC TTTOGGGAGC GAACCTCAAC CTCTCCTGCC ACTCGGCCTC TAACCCATCC 1980
CCXaCAGTATT CTTGGOGTAT CAATGGGATA CCGCAGCAAC ACACACAAGT TCTCTTTATC 2040
GCCAAAATCA GGCCAAATAA TAACGGGACC TATGCCTGTT TTGTCTCTAA CTTGGCTACT 2100
GGCOGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCTCCTGGT 2160
CTCTCAGCTG GGGCCACTGT CGGCATCATG ATTGGAGTGC TGGTTGGGGT TGCTCTGATA 2220
TAGCAGCCCT GGTGTAGTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280
TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAGAAA 2340
AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACATCG TGAAACCCCA TCTCTACTAA 2400
AAATACAAAA ATGAGCTGGG CTTGGTX3G0G CX3CACCTGTA GTOXAGTTA CrOGGGAGGC 2460
TGAGGCAGGA GAAT0GCTTG AA0C0GGGAG GTGGAGATTG CAGTGAGCCG AGATOGCACC 2520
ACTGCACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580
TCTGACCTGT ACTCTTGAAT ACAAGTTTCT QATACXa^CTG CACTGTCTGA GAATTTCCAA 2640
AACTTTAATG AACTAACTGA CAGCTTCATG AAACTGTCCA CCAAGATCAA GCAGAGAAAA 2700
TAATTAATTT CATGGGACTA AATGAACTAA TGAGGATTGC TGATTCTTTA AATGTCTTGT 2760
TTOOCAGATT TCAGGAAACT rrrrrcrrf TAAGCTATCC ACTCTTACAG CAATTTGATA 2820
AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATG TGGTCGCTCC 2880
AGACTTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAGT 2940
TCAATAAAAA TCTGCTCTTT GTATAACAGA

Seq ID NO: 40 Protein sequence:
Protein Accession #: NP_004354.1

1 11 21 31 41 51
| | | | | |

MESPSAFPHR WCIPWQRLUi TASLLTFWNP PTTAKLTIBS TPFNVAEGKE VLLLVHNLPQ 60
HLFGYSWYKG ERVDGNRQII GYVIGTQQAT PGPAYSGRBI lYPNASLLIQ NIIQNDTGFY 120
TIiMVIKSDLV NBEATGQFRV YPELPKPSIS SNNSKPVEDK DAVAFTCEPE TQDATYLWWV 180
NIIQSZ.PVSPR LQLSNGKBTL TLFNVTRNDT ASYKCXTQNP VSARRSDSVI mVLYGPDAP 240
TISPLNTSYR SGEKUILSCR AASNPPAQYS WFVNGTFQQS TQBLPIMItT VKNSGSYTCQ 300
AHNSDTGLNR TTVTTITVYA BPPKPFITSU NSlfPVEDEDA VALTCEPEIQ MTTYLWWVNN 360
QSLPVSPRIiQ LSNDNRTLTL LSVTRNDVGP YECGIQNELS VDHSDPVILN VLYGPDDPTX 420
SPSYTYYRPG VNLSLSCHAA SNPPAQYSWL IDGNIQQHTQ ELFISNITEK NSGLYTOQAN 480
NSASGHSRTT VKTITVSAEL PKPSISSNNS KPVEDKDAVA FTCEPEAQNT TYLWWVNGQS 540
LPVSPRLQLS NGNRTLTLFN VTEITOARAYV CX5IQNSVSAN RSDPVTLDVL YGPDTPIISP 600
PDSSYLSGAN LNLSCHSASN PSPQYSWRIN GIPQQHTQVL FIAKITPMJOJ GTYACFVSNL 660
A3GRKNSIVK SITVSASGTS PGLSAGATVG IMXGVLVGVA LI






Seq ID NO: 41 DNA sequence

Nucleic Acid Accession #: NM_006952.1

Coding sequence: 11-793

1 11 21 31 41 51

| | | | | |

AATCOOSACA ATGGOGAAAG ACAACTCAAC TGTTOGtTGC TTCCAGGGCC TGCTGATTTT 60 

TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTC5ACT GOQGAGTGCA TCTTCTrTGT 120 

ATCTCACCAA CACAGCCTCT ACCCACTGCT TX5AAGCCACC GACAAOGATG ACATCTATGG 180 

GGCTGCCTGG ATCGGCATAT TTGTGGGCAT CTGCCTCTTC TGCCTGTCTG TTCT AGGC AT 240 

TGTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCTGGCG TATTTCATTC TGATGTTTAT 300 

AGTATATGCC TTTCAAjGrGG CATCTTGTAT CACAGCAGCA ACACAACGAG ACTTTTTCAC 360 

ACCCAACCTC TTCCTGAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420 

TGATCACCAG TGGAAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA TGCTCC ACGA 480 

CAATTGCTCT GGOGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 540

TGAGAATAAT GATGCTGACT ATOCCTGGCC TOGTCAATGC TGTGTTATGA ACAATCTTAA 600 

AGAACXnxrrC AACCTGGAGG CTTGTAAACT AGGOnXKXrr GGTrrXTATC ACAATCAGGG 660 

CTCCTATGAA CTGATCTCTG CTCCAATGAA CCGACAOCCC TGGQGGGTTG CCTGGTTTGG 720 

ATTTGCCATT CTCTGCTGGA CTTrTTGGGT TCrCCTGGGT ACCATGTTCT ACPC3GAGCAG 780 
AATTGAATAT TAAGAA 



Seq ID NO: 42 Protein sequence: 
Protein Accession #: NP_008883.1

1 11 21 31 41 51

| | | | | |

HAKDNSTVRC FQGIiLIFGNV IIGCCGIALT AECIFFVSDQ HSIiYPLLBAT CNDDIYGAAW 60 

IGIFVGICIiP CLSVLGIVGI MKSSRKILLA YPILMPIVYA FEVASCITAA TQRDFFTPNL 120 

FIiKQMLERYQ NNSPPNNDDQ WKNNGVTKTW ORLMLQDNCC GVM6PSDHQK YTSAFRTEIIN 180 

DADYPWPRQC CVMNNLKBPI* MUSACKLGVP GPYHNQGCYE LISGPMMRHA WGVAWFGPAI 240 
LCHTFWVLLG TOFYWSRIEY 



Seq ID NO: 43 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 83-2605

1 11 21 31 41 51

| | | | | |

GCCXK5ACAGA TCTGCGOGTA TCCTGGAGCC GGCCCAGTTG TGAACTAGGA GAGCTTTGjSG 60 

ACCTCTGTCC CAAGCAAGAG AGATGAATGG AGAGTATAGA GGCA6AGGAT TTGGACXJAGG 120 

AAGATTTCAA AGCTGGAAAA GGGGAAGAGG TGGTGGGAAC TTCTCAGGAA AATGGAGAGA 180 

AAGAGAACAC AGACCTGATC TGAGTAAAAC CACAGGAAAA CGTACTTCTG AACAAACCCC 240 

ACAGTTTTTG CTTTCAACAA AGACCCCACA GTCAATGCAG TCAACATTGG ATC GATT CAT 300 

ACCATATAAA GGCTGGAAGC TTTATTTCTC TGAAGTTTAC AGOGATAGCT CTCCTTTGAT 360 

TGAGAAGATT CAAGCATTTG AAAAATTTTT CACAAGGCAT ATTGATTTGT ATGACAAGGA 420 

TGAAATAGAA A6AAAGGGAA GTATTTTGGT AGATTTTAAA GAACTGACAG AAGGTGGTGA 480 

AGTAACTAAC TTGATACCAG ATATAGCAAC TGAACTAAGA GATGCACCTG AGAAAACCTT 540 

GGCTTCCATG GGTTTGGCAA TACATCAGGT GTTAACTAAG GACCTTGAAA GGCATGCAGC 600 

TGAGTTACAA GCCCAGGAAG GATTGTCTAA TGATGGAGAA ACAATGGTAA ATGTGCXACA 660 

TATTCATGCA AGGGTGTACA ACTATGAGCC TTTGACACAG CTCAA6AATG TCAGAGCAAA 720 

TTACTATGGA AAATACATTG CTCTAAGAGG GACAGTGGTT CGTGTCAGTA ATATAAAGCC 780 

TCTTTGCACC AAGATG6CTT TTCTTT6TGC TGCATGTGGA GAAATTCAGA GCTTTCCTCT 840 

TCCAGATGGA AAATACAGTC TTCCCACAAA GTGTCCTGTG CCTOrGTGTC GAGGCAGGTC 900 

ATTTACTCCT CTCCGCAGCT CTCCTCTCAC AGTTACGATG GACTGGCAGT CAATCAAAAT 960 

CCAGGAATTG ATGTCTGATG ATCAGAGAGA AGCAGGTOGG ATTCCACGAA CAATAGAATG 1020 

TGAGCTTGTT CATGATCTTG TGGATAGCTG TGTCCCGGGA GACACAGTGA CTATTACTGG 1080 

AATTCTCAAA GTCTCAAATG CGGAAGAAGG TTCTOGAAAT AAGAATCACA AGTGTATGTT 1140 

CCTTTTGTAT ATTGAAGCAA ATTCTATTAG TAATAGCAAA GGACAGAAAA CAAAGAGTTC 1200 

TGAGGATGGG TGTAAGCATG GAATGTTGAT GGAGTTCTCA CTTAAAGACC TTTATGCCAT 1260 

CCAAGAGATT CAAGCTGAAG AAAACCTGTT TAAACTCATT GTCAACTCGC TTTGCCCrGT 1320 

CATTTTTGGT CATGAACTTG TTAAAGCAGG TTTGGCATTA GCACTCTTTG GAGGAAGCCA 1380 

GAAATAOGCA GATGACAAAA ACAGAATTCC AATTCXX5GGA GACCCCCACA TCCTTGTTGT 1440 

TOGAGATCCA GGCCTAGGAA AAAGTCAAAT GCTACAGGCA GCGTGCAATG TTGCCCCACG 1500

TGGOGTGTAT GT TT GTGG TA ACACCACGAC CACXTCTGGT CTGACGGTAA CTCTTTCAAA 1560 

AGATAGTTCC TCTGGAGATT TTGCTTTGGA AGCTGGTGCC CTGGTACrTG GTGATCAAGG 1620 

TAtTTGTGGA ATCXiATGAAT TTGATAAGAT GGGGAATCAA CATCAA6CCT TGTTGGAAGC 1680 

CATGGAGCAG CAAAGTATTA GTCTTGCTAA GGCTGGTGTG GTTTGTAGCC TTCCTGCAAG 1740 

AACTTCXZATT ATTGCTGCTG CAAATCCAGT TGGAGGACAT TACAATAAAG CCAAAACAGT 1800 

TTCTGAGAAT TTAAAAATGG GGAGTGCACT ACTATCCAGA TTTGATTTGG TCTTTATCCT 1860 

GTTAGATACT CCAAATGAGC ATCATGATCA CTTACTCTCT GAACATGTGA TTGCAATAAG 1920 

AGCTGGAAAG CAGAGAACCA TTAGCAGTGC CACAGTAGCT CGTATGAATA GTCAAGATTC 1980 

AAATACTTCC GTACTTGAAG TAGTTTCTGA GAAGCCATTA TCAGAAA6AC TAAAGGTGGT 2040 

TCCTGGAGAA ACAATAGATC CCATTCCCCA CCAGCTATTG AGAAAGTACA TTGGCTATGC 2100 

TCGGCAGTAT GTGTACCCAA GGCTATCCAC AGAAGCTGCT CGAGTTCTTC AAGATTTTTA 2160 

CCTTGAGCTC OGGAAACAGA GCCAGAGGTT AAATAGCTCA CCAATCACTA CCAGGCAGCT 2220 

GGAATCTTTG ATTCGTCTGA CAGAGGCACG AGCAAGGTTG GAATTGAGAG AGGAAGCAAC 2280 

CAAAGAACAC GCTGAGGATA TAGTGGAAAT TATGAAATAT AGCATGCTAG GAACTTACTC 2340 

TGATGAATTT GGGAACCTAG ATTTTGAGCG ATCCCAGCAT GGTTCTGGAA TGACCAACAG 2400 

GTCAACAGCG AAAAGATTTA TTTCTGCTCT CSU^CAAGGTT GCTGAAAGAA CTTATAATAA 2460 

TATATTTCAA TTTCATCAAC TTCGGCAGAT TGCCAAAGAA CTAAACATTC AGGTTGCTGA 2520 

TTTTGAAAAT TTTATTGGAT CACTAAATGA CCAGGGTTAC CTCTTGAAAA AAGGCCCAAA 2580 

AGTTTACCAG CTTCAAACTA TGTAAAAGGA CTTCACCAAG TTAGGGCCTC CTGGGTTTAT 2640 

TGCAGATTAA AGCCATCTCA GTGAAGATAT GCGTGCAOGC ACAGACAGAC AGACACACAC 2700 

ACACACACAC ACACACACAC AGACACACAC ACACACAGTC AAATACTGTT CTCTGAAAAA 2760 

TGATGTCCCA AAAGTATTAT AATAGGAAAA AAGCATTAAA TATAATAAAC TAATTTAAGA 2820 




AGTIGATAAAG TCTCCAGATG CAGTAGCTCA CACTGTAAax: ACRGTGACTC AGCSAGGCTGA 2880 

GGTGAGAGGA TTCCTTGAGG 0CA£3GGTTOG AGACCRACCT TGGGCAACAT AGCftAGACCC 2940 

CATTTCTTAA. AAAAAAAAAA AAAAAATTTA AACTTAGCTG GGTATGGTGG CACATGOCTA 3000 

TAGTCTCAGC TACTTGTGftG GCT6AGGCAG GAGGATTCTT TGAGCOCAGG AGTTTGAGGT 3060 

TACAGTCAGC CACAATCACA CCAATCACTG CACTCCAGCC TGGGCAATAA AGTAACTCTT 3120 

GACrCAAAAA AATAAAAAAA AITGTAGTGG TAGCCATGTG TTAATTGTTA AATAAATTCT 3180 

OCAAAGGGCT AAAAGTAAAT TACTTATAAA TTTTTTATAG TTGTATTTTT GACCTGCCTT 3240 

TTATATGTAT GAATATTTCA TAGTTTTGCA TATCAGATGT AGGCATACAG ACAAATACAT 3300 

AAACCAATGA ATATATTACA TATTCTGTGT TCCAATAAAA CTT TATT TAT GGACACTAA A 3360 

ATTTCAATTT CATAAAATTT TOCCATGTCA AGAATACAAA ATACTTGAGT TTTGTTTTTA 3420 

GCTATTTAAT AATAGGTCTC ATTTATTCCA CAGGCTGTAG TTTGTAGTCT TGCTTGAAAC 3480 

AATAGAAACA GACTGATTAA GCAGGAGAAG TTTTTTGAAA GAATTTTGTT TGGCTCACGG 3540

AATTATTAGA AGGCAGGTGA ACCAGGAGGG TAAGCTTCCA GCAGCAATTT GTAAAACCAT 3600 

GCCTTAGAAT TCGACTAAGG AAGAAGCTGC TGACACTCCA CTGCCACACA GGGCACTGGA 3660 

AGAAAGTCCT GCTGCCTCCC TCCCCXavCCT TTGCCACTTC TGCAGCACGA ATAG GT^ AA 3720 

GAATGCCCCC ACCCX3CACCG GAACAGCAAC AAAAGGATTC TGCATGAGAT GCCTCCCTAA 3780

ATTGCTtSAAT TCAAAAAAGA AGTTGCATAC AAAGAC3VTCT GATTGAAAAA GGGTATGTTA 3840 

TATCCCCCTT TCATAGGCTG CTAGGGAGTT TTCCTGGTTC TACTTTCAGG TGGTGGGATC 3900

AATAAGAOCA GAATTTCTCA TATGTTGTGA GAGGATTCAA ATGTTACAGG GTTGCCAGCC 3960 

AAACTATCAA TCATGTATAA ATCCAACAAA CACTTTGTAA CATACAAGAA CTCACGAAAT 4020 

GTCAACCATT GTTGGAGAAT CTACTAAAAT AOGGCTTCCC GCAAAOGAAG ATGAATGGAA 4080 

AATGTAAATA AAAAGAACTG GCAGTGTATA TCAGATGTTT AACTATAGGA OTMSAACTAA 4140 

GATGTCGAGA CTATTGCCAT AGACCACAAT GTAAATTTTT AAGTGAG6AA GGAAAAATCA 4200 

GGAATCAAAA GGGGCCAGGT GCAGTGGCTC ACATCTATAA TCCCAGAGCT TTGGGAGTTC 4260 

GAGGCAGGAG GATCACTTGA AGCCAGTTTT GAGAOCAGCC TATGCAACAC ATTGAGACCC 4320 

TATCTCTACA AAAAATAGAT TAGCTGGGCA CGGTGGTCCA TGCCTATTGT CCTACCTACT 4380 

GTGGAGGCTG AA6TAGGAAA TCACTTGAGC COGAGAGTTT GAGGTTACSVG TGAGCTATGA 4440 
TTATACCACT GCACTCCAGC CTGG6CAAGA GAGCAAGACC TTGTCTCTT 



Seq ID NO: 44 Protein sequence:
Protein Accession #: CAB55276.2

1 11 21 31 41 51 

I ] i I I I 

MNGEYRGRGF GRGRPQSWKR GRGGCaiFSGX HREREHRPDL SKrTGKRTSS QTPQFLLSTK 60 

TPQSMQSTLD RFIPYKGMKL YPSEVYSDSS PLIEKIQAPS KPFTRHIDLY DKDEIERKGS 120 

ILVDFKELTE GGEVTNLIPD lATELRDAPE KTLACKGIAI HQVLTKDLER HAAEIXJAQEG 180 

LSNDGETMVN VPHIHARVYN YEPLTQLKNV RANYYGKYIA LRGTWRVSN IKPLCTKMAF 240 

LCAACGEIQS FPLPDGKYSL PTKCPVPVCR GRSFTALRSS PLTVTKDWQS IKIQELMSDD 300 

QREAGRIPHT lECELVHDLV DSCVP<a>TVT ITGIVKVSNA EEGSRNKNDK CMPLLYIEAK 360 

SISNSKGQKT KSSEOGCKHG MLMEPSLKDL YAIQEIQAEE NLFKLIVMSL CPVIFGHELV 420 

KAGLALALFG GSQKYADDKK RIPIRGDPHI LWGDPGLGK SOMLQAACNV APRGVYVCGN 480 

TTTTSGLTVT LSKDSSSGDP ALEAGALVLG DQGICX3IDSF DKMGNQHQAL LEAMECXISIS 540

LAKAGWCSL PARTSIIAAA NPVGGHYNKA KTVSQJLKMG SALLSRPDLV FILLDTPNEH 600 

HDHLLSEHVI AIRAGKQRTI SSATVARMNS QDSNTSVLEV VSEKPLSERL KWPGETIDP 660 

IPHQLLRKYI GYARQYVYPR LSTEAARVLQ DFYLELRKQS QRLNSSPITT RQLESLIRLT 720 
EARARLBLRE EATKEDAEDI VEIMKYSMLG TYSDEFGMLD FBRSQHGSGM SNRSTAKRFI 780

SALNNVAERT YNNIFQPHQL RQIAKELHIQ VADPaiPIGS IiNDQGYLIiKK GPKVYQLQTM



Seq ID NO: 45 DNA sequence 

Nucleic Acid Accession #: NM_005416.1 

Coding sequence: 149-658

1 11 21 31 41 51 

1 I 1 I } i 

ACCAGATCCC AGAGGCTGAA CACCTCGACC TTCTCTGCAC AGCAGATGAT CCCTGAGCAG 60 

CTGAAGACCA GAAAAGCCAC TAAGACTTTC TGCTTAATTC AGGAGCTTAG AGGATTCTTC 120 

AAAGAGTGTG TCCACGATCC TTTGAAGCAT GAGTTCTTAC CAGCAGAAGC AGACCTTTAC 180 

CCCACCACCT CAGCTTCAAC AGCAfiCAGGT GAAACAACCC AGCCAGCCTC CACCTCAGGA 240 

AATATTTGTT CCCACAACCA AGGAGCCATG CCACTCAAAG GTTCCACAAC CTGGAAACAC 300 

AAAGATTCCA GAGCXawSGCT GTACCAAGGT CCCTGAGCCA GGCTGTACCA AGGTCCCTGA 360 

GCCAGGCTGT ACCAAGGTCC CTGAGCCAGG TTGTACCAAG GTCCCTGAGC CAGGCTGTAC 420 

CAAGGTCCCT GAGCCAGGTT GTACCAAGGT CCCTGAGCCA GGCTACACCA AGGTCCCTGA 480 

ACCAGGCAGC ATCAAGGTCC CTGACCAAGG CTTCATCAAG TTTCCTGAGC CAGGTGCCAT 540 

CAAAGTTCCT GAGCAAGGAT ACACCAAAGT TCCTGTGCCA GGCTACACAA AGCTACCAGA 600 

GCCATGTCCT TCAACGGTCA CTCCAGGCCC AGCTCAGCAG AAGACCAAGC AGAAGTAATT 660 

TGGTGCACAG ACAAGCCCTT GAGAAGCCAA CCACCAGATG CTGQACACCC TCTTCCCATC 720 

TGTTTCTGTG TCTTAATTGT CTGTAGACCT TGTAATCAGC ACATTGTCAC CCCAAGCCAT 780 

AGTCTCTCrC TTATTTGTAT CCTAAAAATA CGTACTATAA AGCTTTTGTT CACACACACT 840 

CTGAAGAATC CTGTAAGCCC CTGAATTAAG CAGAAAGTCT TCATGGCTTT TCTGGTCTTC 900 

GGCTGCTCAG GGTTCATCTG AAGATTCGAA TGAAAAGAAA TGCATGTTTC CTGCTCTTOC 960 
CTCATTAAAT TGCTTTTAAT TCCA 



Seq ID NO: 46 Protein sequence:
Protein Accession #: NP_005407.1

1 11 21 31 41 51

| | | | | |

MSSYQQKQTF TPPPQLQQQQ VKQPSQPPPQ BIFVPTTXEP CHSRyPQPGN TKIPEPGCTK 60 
VPEPGCTKVP EPGCTfCVPBP GCTKVPEPGC TiCVPEPGCTK VPEPGYTKVP EPGSIKVPDQ 120 

GFIKFPEPGA IKVPEQGYTK VPVPGYTKLP EPCPSTVTPG PAQQKTKQK

Seq ID NO: 47 DNA sequence 

Nucleic Acid Accession #: Eos sequence 
1 11 21 31

I I I I 

GOGTGCTGTG CflGGOGTCCC GGOGCTGTGG ATAATTAGAC 
AAGGCTCGTT AGAATTCGCC CTAGAGCTGT ATCATGTATT 
TTGCAATTAA GCTTAGGQAA CCAGCAACAA AAGGAAACTT 
GAAAATGGAT TAGAGAAACT TCTTCCXXXSA TTTAAGGGGA 
TTTGGGGAAA GTGCCOOGAC OGCAGAfiGOG AOGACAGGGG 
AGTCGGOGTT GGGG6C3UG06 GTCGCCTTCC TCATCTGGGC 
TAAGGATAAC ATCCTGGAAA, TGACTTCTGT AOGGTTTGAG 
TTGGAGCTGC CCIGTGGAGT TACAGTTXAC CAAA CRCATT 
CTAAAAACTT TGTGAGAATT TTCTTTTACT AAAATTTTtT 

Seq ID NO: 48 DNA sequence

Nucleic Acid Accession #: CAT cluster






TTCCAAATTT 
TTTTAGTAAA 
CTCCAAGTCA 
TCCTTACTCT 
COSACTACCG 
CCCAAAGOGC 
ATTTTOGCGG 
TT6CAAGCAA 
AGCCTTG6GC 
CGACGCT 



11 
I 

TTTTTTTTGT 
TGAGATTATG 
TSAGTGTGCA 
TCTCGCSAGCX: 
TGAGCAGCTT 
TGGCCGCAGG 
TGAACGACCT 
AOTTAATTTG 
AATGAGGGAA 



21 
i 

AATAAGAAAA 
TTCATGAATG 
GTTGQGCTCA 
CACATCGCCC 
CCTGCTCCCC 
AATCTTTCCC 
CGGGCCAAGT 
AAAGAAAATA 
GAACGTGTCT 



31 

! 

AATTTTAGTA 
TGTTTGGTAA 
AACGGTACAG 
AGATGAGGAA 
TGTCGTCGCC 
CTTAAATCGG 
TTGCTTTTGT 
CATGATACAG 
AGTTATCCAC 



41 
I 

AOGTTCTTCC 
TTCTTTCAAA 
GGCCCGAGGT 
AAGATTCCTG 
AGCAGGAAGC 
GATGTGGGCT 
CCCAACTGCA 
CATGAACATA 
CTTATTACAA 



41 

1 

AAAGAAAATT 
ACTGTAACTC 
AAGICATTTC 
OGGCACOGCT 
TCTGCGGTCG 
GGAAGAAGTT 
TGCTGGTTCC 
CTCTAGGGOG 
AGCCCGGGGA 



51 
I 

CTCATTGCOC 
TTAACTTT6C 

cxrrrcACOGC 

OGGCCAGCGC 
TGCTCACGGT 
CCTAGAAGAG 
CACTCATGAC 
ATCTCATTTA 
A 



51
I 

CTCACAAAGT 
CACA6C3GCAG 
CAGGATGTTA 
GCCGCCAACG 
GGGCACTTTC 
TCTCTAATCC 
CTAAGCTTAA 
AATTCTAA06 
CGCCTGCACA 



Seq ID NO: 49 DNA sequence 

Nucleic Acid Accession #: CAT cluster



1 
I 

TCTTTCTTCT 
CCTGCCGACC 
ACCGTAGACC 
GACACATGCA 
CCCAACCAAA 
TTCATTTAAA 
CTTTCTCTGA 
TCGCTGTAGC 
TAAAGAAACT 
GATTGAACCA 
TGCTGGTATC 
GAG6TGGGGA 
CC6GGTCTCT 
GAGACAGCAT 
ACTG 



Seq ID NO: 50 DNA sequence
Nucleic Acid Accession #: L05187
Coding sequence: 1991-2260



1 
I 

CTGCAGGGAG 
TCAGAAAGGA 
CAGAAGAAGG 
TGAAGQAAAG 
AGAGTCATAA 
GGAAATGGAT 
ATTTCTAGCT 
CCCCTCCCTT 
ACAACCATCT 
CCAGGGTTAA 
CCCTGCACCT 
CAGCCAGCTA 
GATGAGGATG 
AGCTTCTATT 
TCACACCAAA 
ATTGCAACAA 
ATATGTGTAA 
TATTTTAAGT 
CCTCAGTAGA 
AGTTCATAGC 
TGACAAGATA 
ATTTAAGGCA 
AACATAAAAC 
AGTAATTGGC 
AGGAGACCTC 
AGATGGGAAG 
GAGGCTTAGA 
GAGGAAAGTG 
GAAGCCAGCT 
GAGCCAAGAA 
TTCAAAGGGC 



11 

1 

GCAGGTAGAA 
GGAAAAGGCC 
ATTAGCCCCT 
CAGGTTTTCC 
GTAAATTATT 
GGAAGGTCTT 

TccACcrrcA 

TCCCACCTAT 
CAATGACAAG 
CTCATGAAAC 
GGGTCTGAGG 
GTGOCAAAAA 
GTA6TGTGAG 
TCCTTGAGGC 
CCCAAGGGAC 
ACTGGCAATT 
GCAGGTTAAT 
TAAATTACAG 
TAGTCATTGA 
AGAACTAGAA 
TTTATAGAAA 
GTATGCTAGG 
CTAGCAGGAA 
ATGACGGAGA 
TAGGGTGTCA 
AAAAGCATTT 
TGAATATAAA 
GTCTGATGCC 
TTAGTAGGGC 
GAGAACTCCA 
CTGAAAATTA 



21 

1 

AAGGCTTTTG 
AGGGCAGATG 
GAAAGTCCCT 
CAGATTAGCA 
CTGAATGTGT 
GGACTCTGAG 
CCAAGGCAGA 
TCATGTGTGC 
GACAGCAGGT 
CCTCCATGAA 
ATGAGGGTGG 
ATATCAGGT6 
TCATGTGTGA 
AGGGCTCATT 
CACACAGCCC 
CTAGTGTACT 
CCAGGGTTTC 
TCTGGATTTG 
ACTGGGAGTC 
CTCAGGCCAG 
TTTTAATTTA 
CACTTTGGAC 
GGTAATACAT 
TGGGCAGAGA 
AGTGATGTGA 
GGAAGGGACT 
GCCATCCTAT 
ATTTTCCAAA 
ATTTTTCCAG 
ATAAAATGGA 
TCCAAGCTTA 



31 
1 

GGTTTTCAGG 
TCTGGGTGGA 
GAAGTAGGAG 
ACCAGTCA66 
GTAGTTTAAT 
ACAAGGGGTC 
CAAGGAGGGC 
AAGAGTGCCC 
GGCAAGGCTC 
GCCTGCTGCT 
CAGTGAAAAT 
GTGTTCATCA 
CAGGTGAGGA 
CATCTTATAA 
ATTCTGCTCC 
TTTTCATTAT 
AATG6GAGAT 
AAAGGACCTT 
CTGGAGAAGA 
AGCACTCTCA 
TTAGATGGAT 
AAATCAATGC 
ATATATAAAT 
AGGGCTGTGC 
GCTAIGATGG 
GTGTAAGCAC 
AAGTCACAGG 
AGACCTAATA 
AACAGA7ATA 
GCAGAAGAAA 
TTTCATTTTT 



41 

1 

TGGGGGGCAG 
GTGAAGGGAA 
AAGGGTAAAG 
GGGAGGAAQG 
GGAATTGGGA 
TATAATCAGT 
CCACCTCAGC 
TGTCCCACAG 
AACAGGACTC 
CACXXXrTCCC 
TAGGCCAGTG 
AATAA6CCGA 
ATGAAAACAG 
AAGCCAGCTG 
GTATACCAGG 
TAGAAATTAG 
AGAGAATAGT 
AGAGATGGTT 
TTGTTCAAAT 
GTAACACTGC 
CTCTACTGAG 
CCTAAOGTAC 
AAJTTGAAATG 
ACTTTTGGGA 
AGGQGTATtT 
AGACCAGAAG 
CTTTCTACAT 
TGOGGACCTC 
AGGTGCCTTG 
TTGCCTTTTA 
AAATGTAATG 



51 
1 

TCTAGCCTGA 
AAAGTGATCC 
GTGTGGTTGG 
TGAGAGTGGG 
AAAAGATGGG 
CCATTTCATT 
TCCTCTGCTC 
AACACGGGGA 
AGATGTCCCC 
TCAAGGCAAG 
ACATCATTTT 
GCCAACCGGT 
AGTGCCCGAG 
GCCATTGCCT 
TAAGTCTCTG 
CTAAAGGCAA 
G6AATATCTT 
AGGGCTCCCA 
GCCCATGGGA 
AATTTCCCCC 
CATTTATTCC 
TTACTTAACA 
CAAAGTAGAT 
GACTTGCTCA 
GGACAAGCAG 
CAAAACCATA 
GGTACTAGGA 
ATGTCCCTCA 
GGTAGGAAGG 

GGGQAGCCAA 



60 
120 
180 
240 
300 
360 
420 
480



60 
120 
180 
240 
300 
360 
420 
480 
540 




60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680 
1740 
1800 
1860 



OOGAGATGAA 
TTGTATGCAT 
CATTTGAAGC 
GCAGCAC3CAG 
CAAGGAGCCC 
CCAGCCCAAG 
CACTCCAGCA 
TTGAGGAGCT 
GCCTATTGAC 
CTAAAAAGAT 
GTCTCACTGA 
AGGTCAAGTG 



AGGCTTTCTC 
CTTTCTTTAA 
ATGAATTCTC 
GTGAAACAAC 
TGCX3VACCCA 
ATTCCAGAGC 
CCAGCCCAGC 
GGCCACTGGA 
CCTGCAGTTA 
GTCCCTTACC 
CTGAGCTAGT 
ACCATCCCTA 



TTCTAAAGQG 
TTGAATCACT 
AGCAGGAGAA 
CTTGOCAGOC 
AGGTGOCTGA 
CCTGCCAGCC 
AGAA6A0CAA 
TACTGAAO^ 
GCATGCIGT C 
CTCATTCTGG 
CTTCTTGTTG 
G 



TCCTGAAATA 
GTGTCACCTT 
OCAGCCTTGC 
TCCAGOCCAG 
GCCCTGCCAC 
CAAGGTGCCT 
GCAiGAAGTAA 
CCTACTCCAT 
ACCCTGAATC 
AGGCTGCTGA 
CTOGGGTGCA 



AAATCTGTTT 
TCTGTCTCTA 
ACCOCACCCC 
GAACCATCCA 
CCCAAAGTGC 
GAGCCCTGCC 
TGTGGTCCAC 
TCTGCTTATG 
ATAATC3GCTC 
GCCTCTGOGT 
TTTGAGGATG 



GGCATTGAAT 
GAAAAAAACA 
C7C3UjOCTCA 

tcoooiaaac 
ctgagccctg 

CTTCAAOGGT 
AGCCATGCCC 
AATCCCATTT 
CTTTGCACCT 
AAGGCTGAAC 
GA7TTGGGGA 



1920 
1980 
2040 
2100 

2160 
2220 
2280 
2340 
2400 
2460 
2520 



Seq ID NO: 51 Protein sequence:
Protein Accession #: AAC26838



1 11 21 31 41 51 

I I ! 1 I 1 

MNSQOQKQPC TPPPOPQQOQ VKQPCQPPPQ EPCIPKTKEP CQPKVPEPCB PKVPEPOQPK 
IPEPOQPKVP EPCPSTVTPA PAQQKTKQK 



Seq ID NO: 52 DNA sequence

Nucleic Acid Accession #: NM_002638.1

Coding sequence: 120-473 



1 

I 

CAATACAGCT 
GCTGGACTGC 
TGAGGGCCAG 
AGGCAGCTGT 
TCAATGGACA 
OGCAAGAGCC 
TCCGGTGCGC 
TCAAGAAGTG 
CGGTCCTTGC 
TGCTGCCCTT 
GAGCT6CCTC 



11 
I 

AAGGAATTAT 
ATAAA6ATTG 

CAC5CTTCTTG 
CACGGGAGTT 
AGATCCCGTT 
AGTCAAAGGT 
CATGTTGAAT 
CTGTGAAGGC 
TGCACCTGTG 
CCCCTTCCCA 
TCTCATCCAC 



21 
I 

CCCTTGTAAA 
GTATGGCXTTT 
ATCGTGGTGG 
CCTGTTAAAG 
AAAGGACAAG 
CCAGTCTCCA 
CCCCCTAACC 
TCTTGCGGGA 
CCGTCCCCAG 
CACTGTCCAT 
TTTCCAATAA 



31 

1 

TAGCACAGAC 
AGCTCTTAGC 
TGTTCCTCAT 
GTCAAjGACAC 
TTTCAGTTAA 
CTAAGCCTGG 
GCTGCTTGAA 
TGGCCTGTTT 
AGCTACAGGC 
TCTTCCTCCC 
A 



41 
I 

C06CCCTGGA 

CAAACftccrr 

OCCTGGGACG 
•PGTCAAAGGC 
AGGTCAAGAT 
CTCCTGCCCC 
AGATACTGAC 
OGTTCCCCAG 
CCCa^TCTGGT 
ATTCAGGATG 



51 
1 

GCCAGGCCAA 
CCTGACACCA 
CTGGTTCTAG 
OGTGTTCCAT 

aaagtcaaag 
attatcttga 
tgc!Cx:aggaa 
tgaagggagc 

CCTAAGTCCC 
CCCAGGGCTG 



Seq ID NO: 53 Protein sequence:
Protein Accession #: NP_002629.1

1 11 21 31 41 51

| | | | | |

MRASSFLIW VFLIAGTLVXi EyUlVTGVPVK GQDTVKGRVP PNGQDPVKGQ VSVKGQDKVK 
AQEPVKGPVS TKPGSCPIIL IRCAMLNPPN HCLKDTDCPO IKKCCEGSCG MACFVPQ 

Seq ID NO: 54 DNA sequence 
Nucleic Acid Accession #: NM_019618
Coding sequence: 75-584 



GGCAGGAGCC 
GAGACAACCA 
ATCAATCAAT 
CCCTTCAGGG 
TTGCTGTTAT 
ATTTGGGAAT 
CATTGCAGCT 
CCTTCCTTTT 
CGGACTGGTT 
GGAAGTCATA 
GCAGCTTGGT 
AGTGTCATTT 
TAATGAAGAA 
GGAGAGCTGG 
CTGCATGAGT 
TGAAGATGCT 
CTCTGTTTCT 
CCAATATACC 
TAATTCTTGT 
AATAAACTTT 



11 

I 

AGGATTCAGT 
CACTATGAGA 
GTGTAAACCT 
TCAGAACCTT 
CACATGCAAG 
CCAGAATCCA 
AAAAGAGCAG 
CTACCGTGCC 
CATTGCCTCC 
CAACACTGCC 
CTTTGTCTTA 
TCACGCTGGT 
GAAGCAATTA 
GTGGTATAAG 
GACTTTAAGA 
TCAGAGCTCA 
GTTTTGCTTT 
TCATTGTGTG 
GTTAAGTTAA 
GTGTATTTAT 



21 
I 

CCCCTGGACT 
GGCACTCCAG 
ATTACTGGGA 
GTGGCAGTTC 
TATCCAGAGG 
GAAATGTGTT 
AAGATCATGG 
AAGACTGGTA 
TCCAAGAGAG 
TTTGAATTAA 
AAGTTTCTGG 
GCTGAGACAG 
CTTCATAGCA 
GCT6TCCTCT 
CTCAAAGACC 
TGCGCGTTAC 
ATTCCCTCTT 
TAATAGAACC 
ATCATTTTT6 
ATAATAAAAA 



31 
I 

GTAGATAAAG 
GAGACGCTGA 
CTATTAATGA 
CACGAAGTGA 
CTCTTGAGCA 
TGTATTGTGA 
ATCTGTATGG 
GGACCrcCAC 
ACCAGCCCAT 
ATATAAATGA 
TTCCCAATGT 
GGGCAAGGCT 
ACTGAAGAAC 
CAAGCTQGTG 
AAACACTGAG 
CCACGATGGC 
GGGATGATAT 
TTCTTAGCAT 
TOCTAATTGT 
AAAAAAAAAA 



41 



ACCCTTTCTT 
TGGTGGAGGA 
TTTGAATCAG 
CAGTGTGACC 
AGGCAGAGGG 
GAAGGTTGGA 
CCAACCCGAG 
CCTTGAGTCT 
CATTCTGACT 
CTGAACTCAG 
GTTTTCGTCT 
GCTGTTATCA 
AGGATGTGGC 
CTGTGTAGGC 
CTTTCTTCTA 
ATGACTAGCA 
CATCCAGTCT 
TAAGACCTTG 
AATGTGTAAT 
AAA 



51 

I 

GCCAGGT6CT 
AGGGCCGTCT 
CAAGTGTGGA 
CCAGTCACTG 
GATCCCATTT 
GAACAGCCCA 
CCCGTGAAAC 
GTGGCCTTCC 
TCAGAACTT6 
CCTAGAGGTG 
ACATTTTCTT 
TCTCATTTTA 
CTCAGAA6CA 
CACAAG6CAT 
GGGGTGGGTA 
CAGAGCTGAT 
TTATATGTTG 
TAAACAAAAA 
CTTAAAGTTA 



Seq ID NO: 55 Protein sequence: 
Protein Accession #: NP_062564



31 



51 



60 
120 

180 
240 
300 
360 
420 
480 
540 
600 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



1 11 21 

I I I 1 

MRGTPGDADG GGRAVYOSMC RPITGTXNDL NQQVWTLQGQ NLVAVPRSDS VTPVTVAVIT 
CKYPEALEQG RGDPIYLGIQ NPSKCLYCSK VGEQPTLQIiK EQKIMDIiYGQ PEPVKPFliFy 
RAKTGRTSTL ESVAFPDWFI ASSKROQPII LTSELGKSYN TAPELHIND 

Seq ID NO: 56 DNA sequence
Nucleic Acid Accession #: NM_003125
Coding sequence: 65-334 



60 
120 






1 
I 

AGCAGTTCTA 
CAGCATGAGT 
GCAGGTGAAA 
GCCCTGCCAC 
CAAGCTTCCA 
AGCACCAGCC 
AGCOGGCCAC 
CAATTAGCAT 
TCTGAGTCTC 
ATTCATCTGA 
AAATTCACTT 



11 
I 

AGGGACCATA 
TCCCAGCAGC 
CAGCCTTGCC 
COCAAGGTGC 
GAGGCATGGC 
CAGCAGAAGA 
CAGATGCTGA 
TCTGTCTCCC 
TGAATGAAGC 
AGAGAGACTT 
TCAATTCCA 



21 
I 

CAGAGTATTC 
AGAACCAGCC 
AGCCTCCACC 
CTGAGCCCTG 
ACCCCUGGT 
CCAAGCAGAA 
ATCCCCTATC 
CCAAAAAAGA 
TGAAGGTCTT 
AAGATGAAAG 



31 
I 

CPCTCTTCAC 
CTGCATCCCA 
TCAGGAACCA 
CXZACCCCAAA 
GOCTGAGCCC 
GTAATGTOGT 
CCATTCTGTG 
ATGTGCTATG 
AGTACCAGAG 
CAAATGATTC 



41 

I . 

ACCAG6ACCA 
CCCCCrCAGC 
TGCATCCCCA 
GTGCCTGAGC 
T6CCCTTCAA 
CCACAGCCAT 
TATGAGTCCC 
AAGCTTTCTT 
CTAGTTTTCA 
AGCTCCCTTA 



51
I 

GCCACTUTTG 
TTCAGCAGCA 
AAACCRAGGA 
CCTGCCAGOC 
TAGTChCTCC 
GCCCTTGAGG 
ATTTGOCTTG 
TCCTACACAC 
GCTGCTCAGA 

TAOXCCArr 



60 

120 
180 
240 
300 
360 
420 
480 
540 
600 






Seq ID NO: 57 Protein sequence:
Protein Accession #: NP_003116

1 11 21 31 41 51

| | | | | |

HSSQQQKQPC IPPPQbQQQQ VKQPCQPPPO BPClPKTItBP CHPKWPBPOJ KWPEPOQPK 

IiPEPCHPKVP KPCPSIVTPA PAQQRTICQK 

Seq ID NO: 58 DNA sequence

Nucleic Acid Accession #: NM_001793.3

Coding sequence: 71-2560



1 
I 

AAAGGGGCAA 
CrCPGCAGCC 
CTGGCTGCAG 
CTTGGAGGCG 
CTGCCCTGGG 
TGGCGAGACA 
ATCCAAACX3T 
TGAAAATGGC 
AGACACCAAG 
CTTCGCTGTA 
GATTGCCAAG 
CCCCATGAAC 
GGACACCTTC 
GACAGCCACG 
CCATAGCCAA 
CACCArrCAGC 
CATCCAGGCC 
GATCCTTGAT 
GCCTGAGAAT 
CAACTCACCA 
TACCATCACC 
TTTTGAG6CC 
GCTGAAGCTC 
ACCTGTGTTT 
GCCTGTGTGT 
CATCCTGAGA 
TGTGGGCACC 
GGTCTTGGCC 
ACTGATTGAT 
CCAAAGCCCT 
CCCTTTCCAG 
GGAAGGTGAC 
GCAOCTTTCT 
GTGCGACTGC 
CCCTGTGCTG 
GAGAAAGAAG 
CGTCTTCTAC 
GCTCCACCGA 
CATCATCCCG 
TATAATTGAG 
CTTGGTGTTC 
CTCCGCCTOC 
QAA6CTGGCA 
GGGACCAAAC 
GACTTCGGAG 
AOGTTAGAGT 
AGCA CTSftAA 
TCTTACCTGC 
TACAGTGGAC 
TTTTTTTAAT 
GCTGGGOCCA 
TGGATCTCTG 
GTTGOGTTGC 
TAAAGAAACT 



11 
1 

GAGCTGAGOG 
ATG6GGCTCC 
TGCG0GGC3CT 
GGAGGCGCXK3 
CAAGAGCCAG 
GTCCAGGAAA 
ATCTTAOGAA 
AAG6GTC0CT 
ATTTTCTACA 
GAGAAGGAGA 
TATGAGCTCT 
ATCTCXSVTCA 
CGAGGGAGTG 
GATGAGGATG 
GAACCAAAGG 
GTCATCTCCA 
ACAGACATGG 
GCCAATGACA 
GCAGTGGGCC 
G0GTGGG6TG 
AGCXAOOCTG 

aaaaaccagc 
ccaacctcca 
gtcccaccct 
gtctacactg 
gacxx:agcag 
ctogaccgtg 

ATGGACAATG 
GTCAATGACC 
GTGCGCCAGG 
GCCCAGCTCA 
ACAGTGGTCT 
CTGTCTGACC 
CATGGCCATG 

ggogctgtcx: 

CGGAAGATCA 
TATGGCGAAG 
GGTCTGGAGG 
ACACCCATGT 
AACCTGAAGG 
GACTATGAGG 
GACCAAGACC 
GACATGTACG 
GTCAGGCCAC 
CTTGTCAGGA 
GGTTGCTTCC 
ACCTCTCCAC 
CGTAAAATGC 
TTTCTCTCTG 
GCTATCTTCA 
CTGGCCGTCC 
CGTTTTTATA 
TATAGATGAA 
TTTCCCAGAA 



21 
I 

GAACACCGGC 
CTC3GTGGACC 
CCGAGCCGTG 
AGCAGGAGCC 
CTCTGTTTAG 
GAAGGTCACT 
GACACAAGAG 
TCCCCCAGAG 
GCATCACGGG 
CAGGCTGGTT 
•rTGGCCACGC 
TCGTGACCGA 
TCTTAGAGGG 
ATGCCATCTA 
ACCCACACGA 
GTGGCCTGGA 
ATGGGGACGG 
ATGCTCCCAT 
ATGAGGTGCA 
CCACCTACCT 
AGAGCAACCA 
ACACCCTGTA 
CAGCCACCAT 
CCAAAGTCGT 
CAGAAGACCC 
GGTGGCTAGC 
AGGATGAGCA 
GAAGCCCTCC 
ATGGCCCAGT 
TGCTGAACAT 
CAGATGACTC 
TGTCCCTGAA 
ATGGCAACAA 
TCGAAACCTG 
TGGCTCIGCT 
AGGAGCCCCT 
AGGGGGGTGG 
CCAGGCOGGA 
ACCX3TCCTCX3 
CGGCTAACyVC 
GCAGQGGCTC 
AAGATTAGGA 
GTGG0GGG6A 
AGAGCATCTC 
AGTGGCCGTA 
TTAGCCTTTC 

ctggggcagg 
tcaaccctgt 
gaatggaacc 
aaacgttaga 
tgcatttctc 
ctgagtgtgc 
gggtgaggac 

AAAAA 



31 

I 

CCGOOGTOGC 
TCTCGCGTCT 
OCGGGOGGTC 
CGGCCAGGCG 
CACTGATAAT 
GAAGGAAAGG 
AGATTGGGTG 
ACTGAATCAG 
GCCGGGGGCA 
GTTGTTGAAT 
TGTGTCAGAG 
CCAGAATGAC 
AGTCCTACCA 
CACCTACAAT 
OCTCaTGTTC 
OOOGGAAAAA 
CTOCACCACC 
GTTTGACCCC 
GAGGCTGACG 
TATCATGGGC 
GGGCATCCTG 
CGTTGAAGTG 
AGTGGTCCAC 
TGAGGTCCAG 
TGACAAGGAG 
CATGGACCCA 
GTTTGTGAGG 
CACCACTGGC 
OCCTGAGCCC 
CACX3GACAAG 
AGACATCTAC 
GAAGTTCCTG 
AGAGCAGCTG 
CCCTG6ACCC 
GTTCCTCCTG 
CCTACTCCCA 
CGAAGAGOAC 
GGTGGTTCTC 
GCCAGCCAAC 
AGACCCCACA 
06ACGCCGGG 
TTATCTGAAC 
GGAOGACTAG 
CAAGGGGTCT 
GCAACTTGGC 
AGGATGGAGG 
GTTGCCTCAG 
GTCCTGGGCC 
TTCTTAGGCC 
GAAAGTTCTT 
GTTTCCAGAC 
CTAGGTTGGC 
AATCGTGTAT 



41 

1 

GGCAGCTGCT 
CTCCTCCTTC 
TTCAGGGAGG 
CTGGGGAAAG 
QATCACTTCA 
AATCCATTGA 
GTTGCTCCAA 
CTCAAGTCTA 
GACAGCCCCC 
AAGCCACTGG 
AATGGTGCCT 
CACAAGCCCA 
GGTACTTCTG 
GGGGTGGTTG 
ACCATTCACC 
GTCCCTGAGT 
ACGGCAGTGG 
CAGAAGTACG 
GTCACTGATC 
GGTGACGAOG 
ACAACCAGGA 
ACCAACGAGG 
GTGGAGGA7G 
GAGGGCATCC 
AATCAAAAGA 
GACAGTGGGC 
AACAACATCT 
ACGGGAACCC 
CGTCAGATCA 
GACCTGTCTC 
TGGAOGGCAG 
AAGCAGGATA 
ACGGTGATCA 
TGGAAGGGAG 
CTGGTGCTGC 
6AAGATQACA 
CAGGACTATS 
G6CAAT6AGG 
CCAGATGAAA 
GCCCCGCCCT 
TCCCTGACCT 
GAGTGGGGCA 
GC6GCCTQCC 
CAGTTCCCCC 
GGAGACAGGC 
AATGTGGGCA 
AGGCCAAGTT 
TGGGCCTGCT 
TCCTGGTGCA 
CAAAAGTGCA 
COCAATGOCT 
CCTTATTTTT 
ATGTACTAGA 



51
I 

TCACCCCTCT 
TCCAGGTTTG 
CTGAA6TGAC 
TATTCATGGG 
CTGTGCaSGAA 
AGATCTTCCC 
TATCTGTCCC 
ATAAAGATA6 
CT6AGGGTGT 
ACOGGGAGGA 
CAGTG6AGGA 
A6TTTAC0CA 
TGATGCAGGT 
CTTACTCCAT 
GGAGCACAGG 
ACACACTGAC 
GAGTAGTGGA 
AGGOCCATGT 
TGGACGCCCC 
GGGACX3VTTT 
AGGGTTTGGA 
CCCCTTTTGT 
TGAATGAG6C 
CCACTGGGGA 
TCAGCTAC06 
AGGTCACAGC 
ATGAAGTCAT 
TTCTGCTAAC 
GCATCTGCAA 
CCCACACCTC 
AGGTCAACGA 
CATATGACGT 
GGGCCACTGT 
GTTTCATCCT 
TTTTGTTGGT 
CC0GT6ACAA 
ACATCAOCCA 
TG6CACCAAC 
TCX3GCAACTT 
AOGACACCCT 
OCCTCACCTC 
GCC6CTTCAA 
TGCAGG6CTG 
TTCAGCTGAG 
TATGAGTCTG 
GTTTGACTTC 
TCCAGAAGCC 
GTGACTGACC 
ACTTAATTTT 
GCCCAGA6CT 
GCCATTOQGA 
TATTTTCCCT 
ACTTTTTTAT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 



Seq ID NO: 59 Protein sequence:
Protein Accession #: NP_001784.2






1 

I 

NQLPSGPUVS 
QEPALPSTDM 
KGPFPQRU^ 
YELFCTAVSE 
DEDDAIYTYN 
TDKDGDGSTT 
AWRATYIiIKG 
PTSTATIWH 
DPAGHLAKDP 



TWLSLKKFL 
GAVLALLFLL 
GLBARPEWL 
DYBGS6SDAA 



11 
I 

LLLLQVCWLQ 
DDFTVBNGET 
LRStlXDSDTR 
KGASVEDPMK 
GWAYSIHSQ 
TAVAVVEIU3 
GDDGDHPTIT 
VEDVSEAPVF 
DSGCJVTAVGT 
RQITICMQSP 
KQDTYDVHLS 
LVliIiLLVRKK 
RNDVAPTIIP 
SLSSLTSSAS 



21 
I 

CAASEPCRAV 
VQSRRSLKER 
IFYSItGPGA 
ISIIVTDG^ 
EPKDPHDLMP 
ANDNAPMFDP 
THPSSNQGIL 
VPPSKWBVQ 
LDREDEQFVR 
VRQVLZflTDK 
LSDHGNKEQL 
RKIKEPLLU' 
TPMY&PRPAH 
DQOQOYDYXiil 



31 
I 

FREAEVTLEA 
NPLBaFPSKR 
OSPPBGVFAV 
HEFKFTQDTF 
TIERSTGTIS 
QKYEAHVPai 
TTRKGIJ3FBA 
BGIPTGBPVC 
KNZYEVKVLA 
DLSPHTSPPO 
TVIRATVCDC 
EDDTRI»IVFy 
POKIGNFIIS 
EWGSRFKKLA 



41 
I 

GGAEQEPGQA 
IIifiSHKRDWV 
EKSTG&?LLLK 
RCSVLBGVLP 
VISSGLDSEK 
AVGHEVQRLT 
KNQHTLYVEV 
VYTAEDPDKE 
MnUGSPPTTG 
AQLTDOSDIY 
HGaVETC PGP 
YGSBGGGSED 
NLKAA2ITDPT 
EJMYGGG&DD 



51 

! 

USKVFMGCFG 
VAPISVPEWG 
KPL03SEIAK 
GTSVKQVTAT 
VPEYTLTIQA 
VTOUkAPKSP 
TNKAPFVLKL 
KQKISYRILR 
TGTLLLTIiID 
WTAEVNKEGD 
WKGGFXLPVL 
QDYDITQI£R 
APPYDTIiLVP 



Seq ID NO: 60 DNA sequence

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 162-428 



GGGTTCCGTT 
CATAOGGACC 
AGCACCTOGG 
TCTCCCAGAG 
AGOGAAAGAA 
GTTTACTGTT 
GTAGAGTCAT 
GAGGTTAGAA 
AGATCATAAA 



11 
I 

GGOGOCGGAT 
GGATTGTTTT 
AAGCTGAGGC 
GAAGCAGATA 
GCCTCAACTT 
TGTTCATC6A 
TAACAAGGAG 
GTCAAAGAAC 
GACATTTTTT 



21 
I 

TCGAAOGTTC 
0GCT6GCCCA 
AGCTGGTACT 
AAGCGGAAGG 
CGTCTGGAOV 
TTAGCAGAAG 
CATGTACTGG 
ATATTCTTGA 
ACACATCAGT 



31 
I 

GGACTGAOGT 
GTGTCOCOGG 
TGACAGAGAG 
CTCCCCGTGG 
AAAGTGCTGA 
AGTCCAGGAC 
CCGCAGCAAA 
AAGTTATGAT 
TAATATGGGA 



41 

1 

TTTTCTGCCT 
AGCTTGTGTG 
GATGGCGCTG 
CTTTCTAAAG 
CTTATTGGTC 
AAAaSCTTGT 
GGTAATTCTA 
GCATTCTTTT 
TTATTAAATA 



51 
1 

GAAGAAGCXrr 
CGATACAGAG 
TOGACCAtAG 
06AGTCTTCA 
CATCTGAACT 
GOGAGTAAAT 
AAGAAGAGCA 
GGGTGGTAAC 
TTGG 



Seq ID NO: 61 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 3 

1 1 i I 

MALSTIVSQR KQIiCRKAPRG FLKRVFKRKK 
HACASKCRVI NKEHVLAAAK VILKKSRG 



Seq ID NO: 62 DNA sequence

Nucleic Acid Accession #: NM_000094.2 

Coding sequence: 99-8933 



41 51 

I I 
LIiVHLNCLLF VHRLAEESST 



GGGCTGGAGG 
GAGGOGGGGG 
CCGCGCTCTG 
GAGTGACCTG 
CCATTGGCCG 
TCTCTGGAGC 
CAOGGACAGA 
GTGAGCTTAG 
ACCATGTCTT 
CAGACX3GGAA 
TCAAGCTATT 
CACAGCCAAC 
TGCCCCTOGT 
CGGATGACTC 
TGAGAGTACA 
CTCT6A0GGG 
GT6AGACCAG 
TTGCCCTCTA 
TAGAAGGGCX: 
GGAGTGTGCC 
CACAGCAGCA 
GCACGGACTA 
CCCTGATGGC 
CCACATCCAT 
GGOGGCGTGA 
GCTACCAGTT 
TGGAGGGCCA 
TQAGCCCTGT 
G6AGCCCAGT 
AGOGGACCCT 
GGCTTAGCTA 
TCCTCACTGT 
TGTCAGATGC 
GGATTAGCTG 
CTGCCACAGA 
TGOGAGGCAG 
CAGTGAGGAC 
GGGTTCCTGG 



11 
I 

GGCGCTGGGC 
TCCTACCTGA 
CGCG6GGATC 
CACGCGCCTT 
CAGCAATTTC 
AGCCAGTGCA 
GTTCGGCCTG 
CTACAAGGGG 
CCTGCCCCAG 
GTCCCAG6AC 
TGCTGTGGGG 
CTCCGACTTC 
TTCCCGGAGA 
GACCTCTGCT 
GTGGACAGCG 
GCTGG6ACAG 
TGTGCG6CTG 
CGCCAACAGC 
GGAACTGACC 
AGGTGCCACT 
GGAGCTGGGC 
TGAGGTGACC 
TCGCACTGAC 
CCTOCTTTCC 
GACTGGCTTG 
GGATGGGCTG 
CGAGGTGGCC 
AACAGACCTG 
CCCTGGTGCC 
GGTGCTTOCT 
CACTGTGCGG 
CCGCCGGGAG 
AACGOGAGTG 
GAGCACAGGC 
CATCACAGGG 
AGAGGAGGGC 
GGTCCATGTG 
06CCACAG6A 



21 
I 

TCGGACCTGC 
CGGCTTTTAC 
CTGGCAGAGQ 
TACGCOGCTG 
CGCGAGGTCC 
CAGGGTGTGC 
GATGCACTTG 
GGCAACACrC 
CTGGCCCGAC 
CTGGTGGACA 
ATCAAGAATG 
TTCTTCTTCG 
GTGTGCAOGA 
CCACGAGACC 
GCCAGTGGCC 
CCACTGCG6A 

ATCGGGGAGG 
ATCCAGAATA 
GGCTACCGTG 
CCTGGGCAGG 
GTGAGCACCC 
GCTTCTGTTG 
TOGAACTTGG 
GAGCCACOGC 
CAGCCGGGCA 
ACCCCTGCAA 
CAAGCCACC6 
ACCCAGTACC 
GGGAGTCAGA 
GTGTCTGCTC 
CCGGAAACrC 
AGGCrGGCXn" 
AGTGGTCCGG 
CTGCAGCCTG 
CCTGCTGCAG 
ACTCAGGCCA 
TACAGGGTTT 



31 
I 

CAAGGCCACC 
TGCCTAGGAT 
06CCCG6AGT 
ACATTGTGTT 
GCAGCTTTCT 
GCTTTGCCAC 
GCTCTGGGGG 
GCACAGGGGC 
CTGGTGTCCC 
CAGCTGCCCA 
CTGACCCTGA 
TCAATGACTT 
CTGCTGGTGG 
TGGTGCTGTC 
CTGTGACIGG 
GTGAGGGGCA 
GGCCACTGAC 
CTGTGAGOGG 
CCACAGCCCA 
TGACATGGCG 
GTTCAGTGTT 
TATTTGGCCG 
AGCAGACCCT 
TGCCTGAQGC 
AGAAGGTGGT 
CTGAGTACCG 
COGTGGTTCC 
AGCTGCCOGG 
GCATCATTGT 
CAGCATTOGA 
GAGTGGGTCC 
CACTTGCTGT 
GGGGACCOGT 
AGTCCAGCCA 
GAACCACCTA 
TCATCGTGGC 
GCAGCTCATC 
CCTGGCACTC 



41 

I 

GCAGGGGGGA 
GAOGCTGCGG 
GCGAGCCCAG 
CTTACXGGAT 
CGAAGGGCTG 
AGTGCAGTAC 
TGATGTGATC 
TGCAATTCTC 
CAAGGTCTGC 
AAGGCTGAAG 
GGAGCTGAAG 
CAGCATCTTG 
CGTGCCTGTG 
TGAGCCAAGC 
CTACAAGGTC 
GGAGGTGAAC 
CGAGTACCAA 
GACAGCTCGG 
CAGCCTCCTG 
GGTCCTCAGT 
GCTGCGTGAC 
CAGTGTGGGG 
GCGCCCGGTC 
CGGTGGCTAC 
ACTGCCCTCT 
CCTCACACTC 
CACTGGACCA 
GCAGCGGGTG 
GCGCAGCACC 
CTTGGATGAC 
CCGTGAGGGC 
TCCAGGGCTG 
CCCTGGAGCC 
GACACTGCCC 
CCAGGTGGCT 
TCGAACGGAC 
TGTCACCATT 
AGCCCAOGGC 



51 

I 

GCAAGGGACA 
CTTCTGGTGG 
CACAGGGAGA 
GGCTCCTCAT 
GTGCTGCCTT 
AGCGATGACC 
CGGGCCATCC 
CATCTGCCTG 
ATCCTGATCA 
GGGCAGGGGG 
CGAGTTGCCT 
AGGACACTAC 
ACCCGACCrC 
AGCCAATCCT 
CAGTACACTC 
GTCCCAGCTG 
GTGACTGTGA 
ACCACTGCCC 
GTGGGCTGGC 
GGTGGGCCCA 
TTGGAGCCTG 
CCCGCCACTT 
ATCCTGGGCC 
OGGTTGGAAT 
GATGTGACCC 
TACACTCTGC 
GAGCTGCCTG 
CGAGTX3TCCT 
CAGGGGGTTG 
GTTCAGGCTG 
AGTGCCAGTG 
CGG6TTGTGG 
AGTGGATTTC 
CCAGACTCTA 
GTGTOSGTAC 
CCACTGGGCC 
ACCTGGACXA 
CX3^GAGAAAT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



60 
120 
180 
240 
300 
360 
420 
480 



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 




CCCaOTTGGT TTCTGGGGAG GCGAOGGTGG CTGAGCTGGA TGGACTGGAG C CAGATACTG 2340 

AGTATAOGGT GCATGTGAGG GCCCATGTGG CTGGCGTGGA TGGGCCCCCT GOCTCTGTGG 2400 

TTGTGAGGAC TGCCCCTGAG OCTGTGGGTC GTGTGTOGAG GCTGCAGATC CTCftAT GCTr 2460 

CCAGGGAOGT TCTAOGGATC ACCTGGGTAG GGGTCACTGG AGCXaCAGCT TACM3ACTGG 2520

CCTGGGGCOG GAfTTGAAGGC GGCCCCATGA GGCACCAGAT ACTCCCAGGA AACACA GACT 2580 

CTGCAGAGAT CCGGGGTCTC GAAGGTGGAG TCAGCTACTC AGTGCGAGTG ACTGCACTTG 2640 

TCGGGGACCG CGAGGGCACA CCTGTCTCCA 'i'i G 'ITGTC A C TACGOOGCCT GAGGC TCOGC 2700 

C3«3CCCTGGG GAOGCTTCAC GTQGtGCAGC GCGGGGftGCA CTOGCIGAGG CTGOGCTGGG 2760 

AGCCGGTCCC CAGAGOGCAG GGCTTCCTTC TGCACtGGCA ACCTGAGGGT GGCCAGGAAC 2820 

AGTCCCGGGT CCTGGGGCCC GAGCTCAGCA GCTATCACCT GGACGGGCTG GAGCCAGOGA 2880 

CACACTACCG OCTGAOSCTG AGTGTCCTAG GGCOQGCTGG AGAAGGGCCC TCTGCAGAGG 2940 

TGACTTGCGOG CACTGAGTCA CCTC3GTGTTC CAA6CATTGA ACTAOGTGTG GTGGACACCT 3000 

OGATGGACTC GGTGACTTTG GCCTGGACTC CAGTOTCCAG GGCATCCAGC TACATCCTAT 3060 

CCTCGCXSGCC ACTCAGAGGC CCTGGCCRGG AAGTGCCPGG GTCCCCGCAG ACACTTCCAG 3120 

GGATCTCAAG CTCCCAGCGG GTGACAGGGC TAGAGCCTGG CGTCTCTTAC ATCTTCTCCC 3180 

TGACX3CCTGT CCTGGATGGT GTGCGGGGTC CTGAGGCATC TGTCACACAG AOGCCAGTGT 3240 

GCCCCCGTGG CCTGGCGGAT GTGGTGTTCC TACCACATGC CACTCAAGAC AATGCTCACC 3300 

GTGaSGACGC TACGAG6AGG GTCCTGGAGC GTCroGTGTT GGCACrTGGG CCTCTTGGGC 3360 

CACAGGCAGT TCRGGTTGGC CTGCTGTCTT ACAGTCA7CG GCCCTCCCCA CTGTTCCCAC 3420 

TGAATGGCTC CCATGACCTT GGCATTATCT TGCAAAGGAT CCGTGACATG CC CTAC ATGG 3480 

AOCCAAGTGG GAACAACCTG GGCACAGCOG TSGTCACAGC TCACAfiATAC ATGTTGGCAC 3540 

CAGATGCTCC TGGGOGCCXX: CAGCAOGTAC CAGtSGGTGAT GGTTCTGCPA GTGGAT6AAC 3600 

CCTTGAGAGG TGACATATTC AGCCCCATCC GTGAGGCCCA GGCTTCTGGG CTTAATGTGG 3660 

TGATGTTGGG AATGGCTGGA GCGGACOCAG AGCAGCTGCG TCGCTTGGOG CCGGGTATGG 3720 

ACTCTGTCX» GACCTTCTTC GCCGTGGATG ATGGGCCAAG CCTGGACCAG GCAGTCAGTG 3780 

GTCTCGCCAC AGCCCTGTCT CAfSGCATOCT TCACTACTCA GCOCOGGOCA OAGCCCTGCC 3840 

CAGTGTATTG TCCAAAGGGC CAGAAGGGGG AACCTGGAGA GATGGGCCTG AGAGGACAAG 3900 

TTGGGCCTCC TGGCGACCCT GGCCTCCCGG GCAGGACOGG TGCTCCOGGC CXXXAGGGGC 3960 

CXXCTGGAAG TGCCACTGCC AAGGGCGAGA GGGGCTTCCC TGGAGCA6AT GGGCGTCCA6 4020 

GCAGCCCTGG CCGCGCCGGG AATCCTGGGA CCCCTGGAGC CCCTGGCCTA AAGGGCTCTC 4080 

GAGGGTZGOC TGG0CCTO6T G6GGACCCGG GAGAGCGAGG ACCTOGAGGC CCAAAGGGGG 4140 

AGCCGGGGGC TCCX36GACAA GTCATCGGAG GTGAAGGACC TGGGCTTCCT GGGCQGAAAG 4200 

GGGACCCTGG ACCATOSGGC CCCCCTGGAC CTGGTGGACC ACTGGGGGAC CCAGGACCCC 4260 

GTGGCCCCCC AGGGCTTCCT GGAACAGCCA TGAAGQ6TGA CAAAGGC3GAT GGTGGGGAGC 4320 

GGGGTCCCCC TGGACCAGGT GAAGGTGGCA TTGCTCCTGG GGAGCCTGGG CTGCCGGGTC 4380 

TTCCCGGAAG CCCTGGACCC CAAGGCCCOG TTGGCCCCCC TG6AAAGAAA GGAGAAAAAG 4440 

GTGACTCTGA GGATGGAGCT CCAGGCCTCC CAGGACAACC TGGGTCTCCG GGTGAGCAGG 4500

GCCCACGGGG ACCTCCTGGA GCTATTGGCC CCAAAGGTGA CCGGGGCTTT CCAGGGCCCC 4560

TGGGTGAGGC TGGAGAGAAG 0G06AA0GTG GACCCGCAGG CCCAGCGGGA TCCCGGGGGC 4620 

TGCCAGGGGT TGCTGGAOGT CCTGGAGCCA AGGGTCCTGA AGGGCCAGCA GGACCCACTG 4680 

GGOGCCAAGG AGAGAAGGGG GAGCCTGGTC GCCCTGGGGA CCCTGCAGTG GTGGGACCTG 4740 

CTGTTGCTGG AOCCAAAGGA GAAAAGGGAG ATGTGGGGCC CGCTGGGCCC AGAGGAGCTA 4800 

CCGGAGTCCA AGGGGAACGG GGCCCACCCG GCTTGGTTCT TCCTGGAGAC CCTGGCCCXA 4860 

AGGGAGACCC TGGAGACOGG GGTCCCATTG GCCTTACTGG CAGAGCAGQA CCCCCAGGTG 4920 

ACTCAGGGCC TCCTGGAGAG AAGGGAGACC CTGGGCGGCC TGGCCCCCCA GGACCTGTTG 4980 

GCCCCCGAGG ACGAGATGGT GAAGTTGGAG A6AAAGGTGA OGAGGGTCCT CCGGGTGACC 5040 

GGGGTTTGCC TGGAAAAGCA GGCGAGOGTG GCCTTCGGG6 GGCAOCTGGA GTTOGGGGGC 5100 

CTGTGGGTGA AAAGGGAGAC CAGGGAGATC CTGGAGAGGA TGGAOGAAAT GGCAGCCCTG 5160 

GATCATCTGG ACCCAAGGGT GACCGTGGGG AGCCGGGTCC CCCAGGACCC CCGGGACGGC 5220 

TGGTAGACAC AGGACCTGGA GCCAGAGAGA AGGGAGAGCC TGGGGACCGC GGACAAGAGG 5280 

GTCCTCGAGG GCCCAAGGGT GATCCTGGCC TCCCTGGAGC CCCTGGGGAA AGGGGCATTG 5340 

AAGGGTTTC6 GQGACCCCCA GGCCCACAGG GGGACCCAGG TGTCOGAGGC CCAGCAGGAG 5400 

AAAAGGGTQA COSGGGTCCC CCTGGGCTGG ATQGCC3GGAG 0GGACTG6AT GGQAAACCAG 5460 

GAGCTCCTGG GCCCTCTGGG CCGAATGGTG CTGCAGGCAA AGCTGGGGAC CCAGGGAGAG 5520 

ACGGGCTTCC AGGCCTCCGT GGAGAACAAG GCCTCCCTGG CCCCTCTGGT CCCCCTGGAT 5580 

TACCGGGAAA GCCAGGCGAG GATGGGAAAC CTGGCCTGAA TGGAAAAAAC GGAGAACCTG 5640 

GGGACCCTGG AGAAGACGGG AGGAAGGGAG AGAAAGGAGA TTCAGGCGCC TCTGGGAGAG 5700 

AAGGTOGTGA TGGCCCCAAG GGTGAGCGTG GAGCTCCTGG TATCCTTGGA CCCCAGGGGC 5760 

CTCCAGGCCT CCCAGGGCCA GTGGGCCCTC CTGGCCAGGG TTTTCCTGGT GTCCCAGGAG 5820 

GCACGGGCCC CAAGGGTGAC CGTGOGGAGA CTGGATCCAA AGGGGAGCAG GGCCTCCCTG 5880 

GAGAGCGTGG CCTGCGAGGA GAGCCTGGAA GTGTGOCGAA TGTGQATCGG TTGCTGGAAA 5940 

CTGCTGGCAT CAAQGCATCT GCCCTGCGGG AGATCGTGGA GACCTGGGAT GAGAGCTCTG 6000 

GTAGCTTCCT GCCTGTGCCC GAACGGOGTC GAGGCCCCAA GGGGGACTCA GGCGAACAGG 6060 

GCCCCCCAGG CAAGGAGGGC CCCATOGGCT TTCCTGSAGA AOGGGGGCTG AAGGGCGACC 6120 

GTGGAGACCC TGGCCCTCAG GGGCCACCTG GTCTG6CCCT TGGGQAGAGG GGCCCCCCCG 6180 

GGCCTTCCGG CCTTGCOGGG GAGCCTGGAA AGCCTGGTAT TCCCC3GGCTC CCAGGCAGGG 6240 

CTGGGGGTGT GGQAGAGGCA GGAAGGCCAG GAGAGAGGGG AGAACGGGGA GAGAAAGGAG 6300 

AACGTGGAGA ACAGGGCAGA GArGGCCCTC CTGGACTCCC TX3GAACCCCT GGCCCCCCCG 6360 

GACCCCCTGG CCCCAAGGTG TCTGTGGATG AGCCAGGTCC TGGACTCTCT GGAGAACAGG 6420 

GACCCCCTGG ACTCAAGGGT GCTAAGGGGG AGCCGGGCAG CAATGGTGAC CAAGGTCCCA 6480 

AAGGAGACAG GGGTGTGCCA GGCATCAAAG GAGACCGGGG AGAGCCTGGA CCGAGGGGTC 6540 

AGGAOGGCAA CCGGGGTCTA CCAGGAGA6C GTGGTATGGC TGGGCCTGAA GGGAAGCGGG 6600 

GTCTGCAGG6 TGCAAGAGGC CCCCCTGGCC CAGTGGGTGG TCATGGAGAC GCTGGACCAC 6660 

CTGGTCCCCC GGOTCTTGCT GGCCCT6CAG GACCCCAAGG ACCTTCTGGC CTGAAGGGGG 6720 

AGCCTGGAGA GACAGGACCT CCAGGAOGGG GCCTGACTGG ACCTACTGGA GCTGTGGGAC 6780 

TTCCTGGACC CCCOGGCCCT TCAGGCCTTG TGGGTCCACA GGGGTCTCCA GGTTTGCCTG 6840 

GACAAGTGGG GGAGACAGGG AAGCCGGGAG CCCCAGGTCG AGATGGTGCC AGTGGAAAAG 6900 

AT6GAGACAG AGGGAGCCCT GGTGTGCCAG GGTCACCAGG TCTGCCTGGC CCTGTCGGAC 6960 

CTAAAGGA6A ACCTGGCCCC ACGGGGGCCC CTGGACAGGC TGTGGTCGGG CTCCCTGGAG 7020 

CAAAGGGAGA GAAGGGAGCC CCTGGAGGCC TTGCTGGA6A CCTGGTGGGT GAGCCGGGAG 7080 

CCAAAGGTGA COGAGGACTG CCAGGGCCCC GAGGCGAGAA GGGTGAAGCT GGCCGTGCAG 7140 

GGGAGCCCGG AGACCCTGGG 6AAGATGGTC AGAAAGGGGC TCCAGGACCC AAAGGTTTCA 7200 

AGGGTCACCC AGGAGTCGGG GTCCCGGGCT CCCCTGGGCC TCCTGGCCCT CCAGGTGTGA 7260 

AGGGAGATCT GGGCCTCCCT GGCCTGCCCG GTGCTCCTGG TGTTGTTGGG TTCCOCGGTC 7320 

AGACAGGCCC TCGAGGAGAG ATGGGTCAGC CAGGCCCTAG TGGAGAGCGG GGTCTGGCAG 7380 

GCCCCCCAGG GAGAGAAGGA ATCCGAGGAC CCCTGGGGCC AOCTGGACXA C06GGGTCAQ 7440 

TGGGACCACC TGGGGCCTCT GGACTCAAAG GAGACAAGG6 A6ACGCTGGA GTAGGGCTGC 7500

CTGGGCCCCG AGGCGAGOST GGGGAGCCAG GCATOOGGGG TGAAGATGGC 0GCC0CX3GCC 7560

AGGAGGGACC C0GAG6ACTC AOGGGGOCOC CTGGCAfiCAG GGGAGAGOST OGGGAGAAGG 7620 

GTGATGTTGG 6AGTGCAGGA CTAAAOGGTG ACAAGGC3AGA CTCAGCIGTG ATGCTGGGGC 7680 

CTCCftGGCCC AOGGGGTGCC AAGGGGSWa W3GGTGAA0G AGGGCCTOSG GGCTTGGATG 7740 

GTGACAAAGG ACCTOGGGGA GACAATGGGG ACCCTGGTGA CAAGGGCACC AAGGGAGAGC 7800

CroGTGACAA GGGCTCAGCC GGGTTGCCAG GACTGCGTGG ACTCCTGGGA CCCCAGGGTC 7860 

AACCTCGIGC AGCACGGATC CCTGGTGACC OGGGATCOCC AGGAAAGGAT GGAGTGCCTG 7920 

GTATCC6AGG AGAAAAAGGA GATGTTGGCT TCATGGGTOC COGGGGCCTC AAGGGTGAAC 7980 

GGGGAGTCAA GGGAGCCIGT GGCCTT6AT6 GAGAGAAGGG AGACAAGGGA GAA6CTGGTC 8040 

CCCCAGGCOG CCCOSGGCTG GCAGGACACA AAGGAGAGAT GGGQGAGCCT GGTGTGOOGG 8100 

GCCAGTCGGG GGCCCCTGGC AAGGAGGGCC TGATOGGTCC CAAGGGTGAC CGAGGCTTTG 8160 

ACGGGCAGCC AGGCCCCAAG GGTGACCAGG GCGAGAAAGG GGAGCGGGGA ACCCCAGGAA 8220 

TTCGGGGCTT CCCAGGCCCC AGTGGAAATG ATGGCTCTGC TGGTCCXXXIA GGGCCACCTG 8280 

GCAGTGTTOG TCCCAGAGGC CCCGAAGGAC TTCAGGGCCA GAAGGGTGAG CGAGGTCCCC 8340 

COGGAGAGAG AGTGGTGGG3 GCTCCTGGGG TCOCTGGACC TCCTGGOGAC AGAGGGGAGC 8400 

AGGGGCGGCC AGGGCCTGCC GGTCCTOGAG GOGAGAAGGG AGAAGCTGCA CTGAOCSGAGG 8460

ATGACATOX; GGGCTTTGTG CGCCAAGAGA TGAGTCAGCA CTGTGOCTGC CAGGGCX^AGT 8520

TCATCX5CATC TGGATCACGA CCCCTCCCTA GTTATCCTGC AGACACTGCC GGCTCCCAGC 8580 

TCCATGCTCT GCCTGTGCTC CGCGTCTCTC ATGCAGAGGA GGAAGAfiCGG GTACCCCCTG 8640 

AG6ATGATCA GTACTCTGAA TACTCCGAGT ATTCTGTGGA GGAGTAOCAG GACOCTGAAG 8700 

CTCCTTGGGA TAGTGATGAC CCCTGTTCOC TGCCACTGGA TGA0G6CTCC TGCACTGCCT 8760 

ACACCCTGOG CTGGTACCAT CGGGCTGTGA CAG6CAGCAC AGAGGCCTGT CACCCTTTTG 8820 

TCTATGGTGG CTGTGGAGGG AATGCCAACC GTTTTGGGAC CCGTGAGGCC TGC6AG0GCC 8880

GCTGCCCACC CCGGGTGGTC CAGAGCCAGG GGACAGGTAC TGCCCAGGAC TGAGGCCCAG 8940 

ATAATGAGCT GAGATTCAGC ATCXXTCTGCA GGAGTCGGGG TCTCAGCAGA ACCCCACTGT 9000 

CCCTCCCCTT GOTGCTAGAG GCTTGTGTGC ACGT6AGCGT GCGAGTGCAC GTtXGTTATT 9060 

TCRGTGACTT GGTCCOGTGG GTCTAGCCTT CCCCXXTTCTG GACAAACXXZC CATTGTGGCT 9120 

CCTGCCACCC TGGCAGATGA CTCACTGTGG GGGGGTGGCT GTGGGCAGTG AGOGGATGT6 9180 

ACTCGCGTCT GACCCGCCCC TTGACCCAAG CCTGTGATGA CATGGTGCTG ATTCTGGGGG 9240 
GCATTAAAGC TGCTGTTTTA AAAGGCAAAA AA 

Seq ID NO: 63 Protein sequence: 
Protein Accession #: NP_000085.1

1 11 21 31 41 51

| | | | | |

MTLRLLVAAL CAGILAEAPR VRAQHRERVT CTRIiYAADIV FLLDGSSSIG RSNFREVRSP 60 

LEGLVLPFSG AASAQGVRFA TVQYSDDPRT EPGLDALGSG GDVIRAIREIi SYKBGNTRTG 120 

AAILHVADHV PIiPQLARPGV PKVCILITDG KSQDLVDTAA QBLKGQGVKL PAVGIKNADP 180 

EELKRVASQP TSDFFPPVMD FSIIiRTLLPL VSRRVCTTAG GVPVTRPPDD STSAPRDLVL 240 

SEPSSQSLRV QWTAASGPVT GYKVQYTPLT GWSQPLPSER QEVNVPAGBT SVRUIGLRPL 300 

TEYQVTVIAL YAKSIGEAVS GTARTTALBG PEIiTIQNTTA HSLLVAWfiSV FGATGYRVTW 360 

RVLSGGFK3Q QELGPGQGSV liliRDLEPGTD' YEVTVSTI*FG RSVGPATSLM ARTDASVEQT 420 

LRPVILGPTS ILLSWIJLVPE ARGYRLEWRR ETGI£PPQKV VLPSDVTRYQ LDGLQPGTBY 480 

RLTLYTLLEG HEVATPATW PTGPELPVSP VTDIjQATELP GQRVRVSWSP VPGATQYRII 540 

VRSTXJGVERT LVLPGSQTAF DLDDVQAGLS YTVRVSARVG PREGSASVLT VRREPETPLA 600 

VPGLRWVSD ATRVRVAWGP VPGASGPRIS WSTGSGPESS QTLPPDSTAT DITGLQPGTT 660 

YQVAVSVLRG REBGPAAVIV ARTDPL6PVR TVHVTQASSS SVTITWTRVP GATGVRVSWH 720 

SAHGPEKSQI. VSGEATVAEL DGLEPDTEYT VHVRAHVAGV DGPPASVWR TAPEPVGRVS 780 

RLQILNASSD VLRITWVGVT GATAYRLAWG RSEGGPMRHQ ILPGNTDSAE IRGLEGGVSY 840 

SVRVTALVGD REGTPVSIW TTPPEAPPAL GTLHWQRGE HSUILRWEPV PRAQGFLLHW 900 

QPEQGQEQSR VLGPBLSSYH LDGIiEPATQY RVRLSVUffiA GBGPSAEVTA RTESPRVPSI 960 

ELRWDTSID SVTLAWTPVS RASSYILSWR PLRGPGQEVP GSPQTLPGIS SSQRVTGIiEP 1020 

GVSYIPSLTP VLDGVRGPEA SVTQTPVCPR 6LADWPLPH ATQDNAHRAB ATRRVLERl*V 1080 

liALGPLGPQA VQVGLLSYSH RPSPLPPLNG SHDLGIILQR IRDMPYMOPS ONZniGTAWT 1140 

AHRYMIAPOA PGRKQHVPGV MVLLVDEPLR GOIFSPIREA QASGLNWMXj QIAGADPEQL 1200 

RRLAPGMDSV QTFFAVDDGP SLDQAVSGLA TALOQASPTT QPRPEPCPVY CPKGQKGBPG 1260 

EMGLRGQVGP PGDPGLPGRT GAPGPQGPPG SATAKGERGP PGADGRPGSP GRAGTTPGTPG 1320 

APGLiQCSPGL PGPRGDPGER GPRGPKGEPG APGQVIGGEG PGLPGRKGDP GPSGPPGPRG 1380

PLGDPGFRGP PGtiPGTAMKG DKCaSRGERGP PGPGEGGIAP 6EPGLPGLPG SPGPQGPVGP 1440 

PGKKGCKGDS EDGAPGLPGQ PGSPGEQGPR GPPGAXGPKG DRGFPGPLGE AGBKGERGPP 1500 

GPAGSRGLPG VAGRPGAKGP BGPPGPTGRQ GEKGEPGRPG DPAWGPAVA GPKGBKGDVG 1560 

PAGPRGATGV QGERGPPGLV LPGDPGPKGD PGDRGPIGLT GRAGPPGDSG PPGBKGDPGR 1620 

PGPPGFVGPR GRDGEVGBKG DEGPPGDPGL PGKAGERGLR GAPGVRGPVG EKGDQGDPGK 1680 

OGRNGSPGSS GPKGDRGEPG PPGPPGRLVD TGPGAREKGE PCTRGQBGPR GPKGDPGLPG 1740 

AP6ERGIBGF RGPPGPQGDP GVRGPAGEKG DRGPPGUJGR SGLDGKPGAA GPSGPNGAAG 1800 

KAGDPGRDGL PGLRGBQGU? GPSGPPGLPG KPGEDGKPGL NGKNGEPGDP GEDGRKGEKG 1860 

DSGASGRBGR DGPKGERGAP GILGPQGPPG LPGPVGPPGQ GPPGVPGGTG PKGDRGETGS 1920 

KGE3QGLPGER GLRGEPGSVP NVDRLLETAG IKASAIiRBIV ETHDESSG5P LPVPERRRGP 1980 

KGDSGEQGPP GKEGPIGFPG ERGLKGDRGD PGPQGPPGLA LGERGPPGPS GLAGEPGKPG 2040 

IPGIiPGRAGG VGEAGRPGER GERGEKGERG EQGRDGPPGI» PGTPGPPGPP GPKVSVDEPG 2100 

PGLSGEQGPP GLKGAXGEPG SNGDQGP2CG0 RGVPGIKGDR GEPGPRGQDG NPGLPGERGM 2160 

AGPBGKPGLQ GPSGPPGFVG GRGDPGPPGA FGLAGPAGPQ GPSGLKGEP6 GTGPPGRGLT 2220 

GPTGAVGLPG PPGPSGLVGP QGSPGLPGQV GETGKPGAPG RDGASGKDGD RGSP6VPGSP 2280 

GLPGPVGPKG EPGPTGAPGQ AWGLPGAKG EKGAPGGLAG DLVGEPGAKG DRGLPGPRGE 2340 

KGEAGRAGEP GDPGEDGQKG APGPKGFKGD PGVGVPGSPG PPGPPGVKGD LGLPGLPGAP 2400 

GWGFPGOTG PRGEMGQPGP SGERGLAGPP GREGIPGPLG PPGPPGSVGP PGASGLKGDK 2460 

GDPGVGLPGP RGERGEPGIR GEDGRPGQEG PRGLTGPPGS RGERCEKGDV GSAGLKGDKG 2520 

DSAV2LGPPG PRGAKGDMGE RGPRGLDGDK GPRGDNOPG DKGSKGEPGD KGSAGLPGLR 2580 

GLLGPQGQPG AAGIPGOPGS PGKDGVPGIR GEKGDVGFMG PRGLKGERGV KGAOGLOGEK 2640 

CSKGEAGPPG RPGLAGHKGE MGEPGVPGQS GAPGKBGLIG PKGDRGFDGQ PGPEQCajQGER 2700 

GERGTPGIGG PPGPSGNDGS AGPPGPPGSV GPRGPBGLQG QKGERGPPGE RWGAPGVPG 2760 

APGERGEQGR PGPAGPRGEK GEAALTEDDI RGPVRQEMSQ HCACQGQPIA SGSRPLPSYA 2820 

ADTAGSQLHA VPVLHVSHAE EEERVPPEDD EYSEYSEYSV EEYQDPEAPW DSDDPCSLPL 2880 

OBGSCTAYTL RWYHHAVT6S TEACHPPVYG GOCSGMANRFG TREACERRCP PRWQSQGTG 2940 
TAQD

Seq ID NO: 64 DNA sequence

Nucleic Acid Accession #: NM_006945

Coding sequence: 1-219

1 11 21 31 41 51

| | | | | |

ATGTCrrATC AACAGCAGCA GTGCAAGCAG CCCTGCCAGC CAOCTCCTG? GTGCCCCAOG 60 
CCAAAGTGCC CAGAGOCATG TCCACCCCCG AAGTGCCCTG AGOCCTGCCC ACCAOCAAAfi 120
TCTCCACAGC CCTGCOCAOC TCAGCAGTGC CAGCAjGAAAT ATCCTCCTGT GACACCTTGC 180 
CCACCCTGCC AGCCAAAGTA TCCACOGAAG AGCAAGTAA 

Seq ID NO: 65 Protein sequence:
Protein Accession #: NP_008876

1 11 21 31 41 51

| | | | | |

MSYQQQQCKQ POOPPPVCPT PKCPSPCPPP KCPBPCPPPK CPQPCPPQQC QQKYPPVTPS 60 

PPOQPKYPPK SK 
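
The "Coding sequence" coordinates attached to each DNA entry can be cross-checked against the length of the paired protein entry. The following is a minimal sketch only. It assumes, as an interpretation that the listing itself does not state, that the coordinates are 1-based and inclusive and that the range ends with the stop codon. The function name `protein_length` is hypothetical and is not part of the disclosure.

```python
def protein_length(cds_start: int, cds_end: int, includes_stop: bool = True) -> int:
    """Residue count implied by a 'Coding sequence: start-end' annotation.

    Assumption: coordinates are 1-based and inclusive, and (by default)
    the final codon of the range is the stop codon.
    """
    n_nt = cds_end - cds_start + 1  # inclusive range length in nucleotides
    if n_nt < 3 or n_nt % 3 != 0:
        raise ValueError(
            f"range {cds_start}-{cds_end} is not a whole number of codons"
        )
    codons = n_nt // 3
    # Drop one codon when the annotated range includes the stop codon.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # Seq ID NO: 64, Coding sequence: 1-219
    print(protein_length(1, 219))
    # Seq ID NO: 66, Coding sequence: 639-2546
    print(protein_length(639, 2546))
```

Under these assumptions the range 1-219 of Seq ID NO: 64 implies 72 residues, which agrees with the 72 residues listed for Seq ID NO: 65 (one row of 60 plus a final row of 12). A mismatch for another entry would point to a transcription problem in either the coordinates or the sequence.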



Seq ID NO: 66 DNA sequence 

Nucleic Acid Accession #: NM_005629.1

Coding sequence: 639-2546

1 11 21 31 41 51

| | | | | |

TACJTCGGAGC GAGGTGGCGA GTOGCTGAGC CCGCOGOC3GC CCOGAGAGOG GCTGCAGCCG 60 

CCGCCGCCGG GAAGGAGAGG GCGAGGCC5CG CCCGAGCCGC OGCCGCCGCC GCCACCGCCG 120 

COGCOGCCAC CACCGCCACC GGAGTCGCGG GCCAGCCGGG CAGCCTCOGC GGGCCCCGGC 180 

CGGGGCGGGG GGCGCGGGCC ACAGGCOCCT GCTCOGGOCG TGGTTTGCA6 ACCGGGGGOG 240 

COGATGTCGC CCGOGCCCCG TTAGGATGAG TCTOGGGTOG GGCGAGGAGC OGCCGCACCC 300 

GCCGCCGCCC GAGCCGCGGG CAGGAGCCTC GGGAGCCGCC GCCGCCGCCG CCGCCGCCGG 360 

GCOGGGCCCC GCCGCCGCCC GCGCGCCCCC GGGCCCCOGA CACACATGAG ATTCTTCAGG 420 

CTCACTTTCA AGTGCTTCGT GGACTGCTTC TGACTGCGCC GCCCGCGCCC OGCACCCCGC 480 

CGTCOGCCOG CCGCCCCGTC CCCCGGCCCG GCCGCCCCCC GGCCCCCGGC CGGCCCGCGC 540 

CCT0GG6GCC CTCCCCGGTG 0CGCCGGT6C CCCCCGCCTG ACCGCCGCCC CCCGTGAGGC 600 

GCCGOGACCC CGGCCCGGCC GTGCGGCCCG CGGGGGOCAT GGCGAA6AAG AGOGCOGAGA 660 

ACGGCATCTA TAGCGTGTCC GGCGACGAGA AGAAGGGCCC CCTCATC6CG COCGGGQCCG 720 

ACGGGGCCCC GGCCAACGGC GACGGCCCCG TGGGCCTGGG GACACCCGGC GGCCGCCTG6 780 

CCGTGCCGCC GCGCGAGACC TGGACGCGCC AGATGGACTT CATCATGTCG TGCGTGGGCT 840 

TCGCCGTGGG CTTGGGCAAC GTGTGGCGCT TCCCCTACCT GTGCTACAAG AACGGCGGAG 900 

GTGTGTTCCT TATTCCCTAC GTCCTGATCG CCCTGGTTGG AGGAATCCCC ATTTTCTTCT 960 

TAGAGATCTC GCIGGGCCAG TTCATGAAGG COGGCAGCAT CAATGTCTGG AACATCTGTC 1020 

CCCTGTTCAA AGGCCTGGGC TACGCCTCCA TGGTGATOGT CTTCTACTGC AACACCTACT 1080

ACATCATGGT GCTGGCCTGG GGCTTCTATT ACCTGGTCAA GTCCTTTACC ACCAOSCTGC 1140 

CCTGGGCCAC ATGTGGCCAC ACCTGGAACA CTCCCGACTG CGTGGAGATC TTCCGCCATG 1200 

AAGACTGTGC CAATGCCAGC CTGGCCAACC TCACCTGTGA CCAGCTTGCT GACOGCCGGT 1260 

COCCTGTCAT OGAGTTCTGG GAGAACAAAG TCTTGAGGCT GTCTGGGGGA CTGGAGGTGC 1320 

CAGGGGCCCT CAACTGGGAG GTGACCCTTT GTCTGCTGGC CTGCTGGGTG CTGGTCTACT 1380 

TCTGTGTCTG GAAGGGGGTC AAATCCACGG GAAAGATCGT GTACTTCACT GCTACATTCC 1440 

CCTACGTGGT CCTGGTCGTG CTGCIGGTGC GTGGAGTGCT GCTGCCTGGC GCCCTGGAT6 1500

GCATCATTTA CTArCTCAAG CCTGACTGGT CAAAGCTGGG GTCCCCTCAG t/itS'ltSGATAG 1560

ATGCGGGGAC CCAGATTTTC TTTTCTTACG CCATTGGCCT GGGGGCCCTC ACAGCCCTGG 1620 

GCAGCTACAA CCGCTTCAAC AACAACTGCT ACAAGGAGGC CATCATCCTG GCTCTCATCA 1680 

ACAGTGGGAC CAGCTTCTTT GCTGGCTTCG TGGTCTTCTC CATCCTGGGC TTCATGGCTG 1740 

CAGAGCAGGG CGT6CACATC TCCAAGGTGG CAGAGTCAGG GC06GGCCT6 GCCTTCATCG 1800 

CCTACCCGCG GGCTGTCAOG CTGATGCCAG TGGCCCCACT CTGGGCT6CC CT6TTCTTCT 1860 

TCATGCIGTT GCTGCTTGGT CTCGACAGCC 7«3TTTGTAGG TGTGGAGGGC TTCATCACCG 1920 

GCCTCCTOGA CCTCCTCCCG GCCTCCTACT ACTTCCGTTT CCAAAGGGAG ATCTCTGTGG 1980 

CCCTCTGTTG TGCCCTCTGC TTTGTCATCG ATCTCTCCAT GGTGACTGAT GGCGGGATGT 2040 

AGGTCTTCCA GCTGTTTGAC TACTACTCGG CCAGCGGCAC CACCCTGCTC TGGCAGGCCT 2100 

TTTOGGAGTG CGTG6TGGTG GCCTGGGTGT ACGGAGCTGA CCGCTTCATG GACGACATTG 2160 

CCTGTATGAT CGGGTACCGA CCTTGCCCCT GGATGAAATG GTGCTGGTCC TTCTTCACCC 2220 

CGCIGGTCTG CATGGGCATC TTCATCTTCA AGGTTGTGTA CTAGGAGCCG CTGGTCTACA 2280 

ACAACACCTA CGTGTACCOG TGGTGGGGTG AGGCCATGGG CTGGGCCTTC GCCCTGTCCT 2340 

CCATGCTGTG CGTGCCGCTG CACCTCCTGG GCTGCCTCCT CAGGGCCAAG GGCACCATGG 2400 

CTGAGCGCTG GCAGCACCTG ACCCAGCCCA TCTGGGGCCT CCACCACTTG GAGTACCGAG 2460 

CTCAGGACGC AGATGTCAGG GGCCTGACCA CCCTGACCCC AGTGTCCGAG AGCAGCAA6G 2520 

TOGTCGTGGT GGAGAGTGTC ATGT6ACAAC TCAGCTCACA TCACCAGCTC ACCTCTGGTA 2580

GCCATAGCAG CCCCTGCTTC AGCCCCACCG CACCCCTCCA GGGGGCCTGC CTTTCCCTGA 2640 

CACTTTTGGG GTCTGCCTGG GGGAGGAGGG GAGAAAGCAC CATGAGTGCT CACTAAAACA 2700 

ACrrTTTCCA TTTTTAATAA AACGCCAAAA ATATCACAAC CCACCAAAAA TAGATGCCTC 2760 

TCCCCCTCCA GCCCTAGCOG AGCTGGTCCT AGGCCCCGCC TAGTGCCCCA CCCCCACCCA 2820 

CAGTGCTGCA CTCCTCCTGC CCCTGCCACG CCCACCCCCT GCCCACCTCT CCAGGCTCTG 2880 

CTCTGCAGCA CACCCGTGGG TGACCCCTCA CCCCAGAAGC AGCAGTGGCA GCTTGGGAAA 2940 

TGTGAGGAAG GGAAGGAOGG AGAGACGGGA GG6AGGAGAG AGAGGA6AAG GGAGGCAGGG 3000 

GAGGGGCAGC AGAACCAAGG CAAATATTTC AGCTGGGCTA TACCOCTCTC CCCATGCCTG 3060 

TTATAGAAGC TTAGAGAGCC AGCCAGCAAT GGAACCTTCT GGTTCCTGCG CCAATOGCCA 3120 

CCAGTATCAA TTGTGTGAGC TTGGGTGCGA GTGCACGOGT GOGTGAGTAC GGAGAGTATA 3180 

TATAGATCTC TATCTCTTAG CAAAGGTGAA TGCCAGATGT AAATGGCGCC TCTQGGCAAA 3240 

GGAGGCTTGT ATTTTGCACA TTTTATAAAA ACTTGAGAGA ATGAGATTTC TGCTTGTATA 3300 

TTTCTAAAAA CAG6AAGGAG CCCAAACCAT GCTCTOCTTA CCACTCCCAT CCCTGTGAGC 3360 

CCTACCTTAC CCCTCTGCCC CTAGCCAAGG AGTGTGAATT TATAGATCTA ACTTTCATAG 3420 

GCftAAACAAA AGCTTOGAGC TGTTGOGTGT GTGAGTCTGT TGTGTGGATG TGOGTGTGTG 3480 

GTCCCCAGCC CCAGACTGGA TTGGAAAAGT GCATGGTGGG GGCCTCGGGG CTGTCCCCAC 3540 

GCTGTCCCTT TGCCACAAGT CTGTGGGGCA AGAGGCTGCA ATATTCCGTC CTGGGTGTCT 3600 

GGGCTGCTAA OCTGGCCTGC TCAGGCTTCC CACCCTGTGC GGGGCACACC CCCAGGAAOG 3660 

GACOCTGGAC AOGGCTCCCA CGTCCAGOCT TAAGGTG6AT GCACTTOCCG CACCTCCAGT 3720
CrrCTGTGTA GCAGCTTTAA CCCACGTTTG TCTGTCAOGT CCAGTCCCGA GACGGCTGAG 3780
TX3ACC0CAAG AAAGGCrTCC OOGACACCCA GACAGAGGCT GCAGGGCTGG GGCTGGGTGA 3840 
GGGTGGOGGG CXTTGCGGGGA CATTCTACTG TQCTMAPAC CCACTGCAGA CATAGCAATA 3900 
AAAACATGTC ATTTTCC 

Seq ID NO: 67 Protein sequence:
Protein Accession #: NP_005620.1

1 11 21 31 41 51

| | | | | |

MAKKSAESTGI ySVSGD£2CKG PLIAPGPDGA PAKGDGPVGL GTPGGRIAVP PRETVTRQMD 60 

PIMSCVGFAV GL(2JWRPPY LCYKNGGGVP LIPYVLIALV GGIPIFFLEI SLGQFMKAGS 120 

INVWKICPLF KdiCYASMVI VFYCNTYYIM VLRWCFVYLV KSFTTTLPWA TCCaTWNTPD 180 

CVEIFRKEDC AHASIAKLTC DQIADWISPV lEFWEKKVLR LSGGLEVPGA UnrEVTLOjL 240 

ACWVLVYFCV WKGVKSTGKI VYFTATPPYV VLWLLVRGV LLPGALDGII YYLKPDWSKL 300 

GSPQVHIDAG TQIFFSYAIG LGALTAIiGSY NRFMNNCYKD AIIIALINSG TSFFAGPWF 360 

SILGFMAABQ GVHISKVAES GPGIiAFIAYP RAVTLMPVAP LMAAI.PFPKL LLI^LDSQFV 420 

GVEGFITCIiL DLLPASVYFR FQREISVALC CALCFVIDZ^ MVTDGGMYVF QLFDYYSASG 480 

TTLIiMQAFHE CWVAMVYGA DRFMDDIACM IGYRPCPWMK WCWSFFTPLV CKGIFIPNW 540 

YYEPLVYNNT YVYPWWGBAM GWAPALSSML CVPLHLLGCL LRAKGTMABR KQHLTQPIWG 600 
LHHLEYRAQD ADVRGLTTLT PVSESSKVW VESVM 



Seq ID NO: 68 DNA sequence 

Nucleic Acid Accession #: NM_021953.1

Coding sequence: 178-2469

1 11 21 31 41 51

| | | | | |

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCCOGT CCGGGGCCCT GGCTCGGCCC 60 

CCAGGTTGGA G6AGCCCX3GA GCCCGCCTTC GGAGCTACGG CCTAACGGCX3 GCS3GCGACTG 120 

CAGTCTGGte GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG ATTCATAATG 180 

AAAGCTAGCC CCXX3TCGGCC ACTGATTCTC AAAAGAOGGA GGCTGCCCCT TCCTGTTCAA 240 

AATGOCCCAA GTGAAACATC AGAGGAGGAA CX:TAAGAGAT CCCCTGCCCA ACACGAGTCT 300 

AATCAAGCAG AGGCCTCCAA GGAAGTGGCG GAGTCCAACT CTTGCAAGTT TCCRGCTGGG 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACAOGCAAG TAGTGGCCAT CCCCAACAAT 420 

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGCCCSVACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

C3GGCCTCAAA OCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CAGAAAOGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCXX5CCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960 

ATCTATACGT GGATTCAGGA CCACTTTCCC TACTTTAAGC ACATTGOaA GCCAGGCTGG 1020

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CAOGACATGT TTGTCOGGGA GAOGTCTGCC 1080 

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACC3GCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC CCGAGCACTT GGAATCACAG 1200 

CAGAAACGAC CGAATCCAGA GCTCCX^CCXW AACATGACCA TCAAAACCGA ACTCCCCCTG 1260 

GGCGCACGGC GGAAGATGAA GCCACTGCTA CCACX3GCTCA GCTCATACCT GGTACCTATC 1320 

CAGTTCCGGG TGAACCAGTC ACTGGTGTTG CAGCCCTOGG TGAAGGTGCC ATTGCCCXTTG 1380 

GCGGCTTCCC TCATGAGCTC AGAGCTTGCC 06CCATAGCA AGCGAGTOCG CATTGCCCCC 1440 

AAGGTGCTGC TAGCTGAGGA GGGGATAGCT CCTCTTTCTT CTGCAGGACC AGGGAAAGAG 1500 

GAGAAACrCC TGTTTGGAGA AGGGTTTTCT CCTTTGCTTC CAGTTCAGAC TATCAAGGAG 1560 

GAAGAAATCC AGCCTGGGGA GGAAATGCCA CACTTAGCGA GACCCATCAA AGTGGAGA6C 1620 

OCTCCXTTGG AAGAGTGGCC CTCCCCGGCC CCATCTTTCA AAGAGGAATC ATCTCACTCC 1680 

TGGGAGGATT CGTCCCAATC TCCCACCCCA AGACCCAAGA AGTCCTACAG TGGGCTTAGG 1740 

TCCCCAACCC GGTGTGTCTC GGAAATGCTT GTGATTCAAC ACAGGGAGAG GAGGGAGAGG 1800 

AGCXXJGTCTC GGAGGAAACA GCATCTACTG CCSCCCtGrTG TGGATGAGCC GGAGCTGCTC 1860 

TTCTCAGAGG GGCCCAGTAC TTCCGGCTGG 6C0GCAGAGC TCCCGTTCCX: AGCAGACTOC 1920 

TCTGACCCTG CCTCCCAGCT CAGCTACTCC CAGGAAGTGG GAGGACCTTT TAAGACACCC 1980 

ATTAAGGAAA CGCTGCCCAT CTCCTCCACC COGAGCAAAT CTGTCCTCCC CAGAACCCCT 2040 

GAATCCTGGA GGCTCAOGCC CCCAGCCAAA GTAGGGGGAC TGGATTTCAG CCCAGTACAA 2100 

ACCrCCCAGG GTCCCTCTGA CXrCTTGCCT GACOCGCTGG GGCTGATGGA TCTCAGCACC 2160 

ACTCCCTTGC AAAGTGCTCC CCCCCTTGAA TCAOCXXaAA GGCTCCTCAG TTCAGAACCC 2220 

TTAGACCTCA TCTCCGTCXX: CTTTGGCAAC TCTTCTCCCT CAGATATAGA CGTCCCCAAG 2280 

CCAGGCTCCC OGGAGCCACA GGTTTCTCGC CTTGCAGOCA ATCGTTCTCT GACAGAAGGC 2340 

CTGGTCCTGG ACACAATGAA TGACAGCCTC AGCAAGATCC TGCTGGACAT CAGCTTTCCT 2400 

GGCCXGQAGG AGGACCCACT GGGCCCTGAC AACATCAACT GGTCCCAGTT TATTCCTGAG 2460 

CTACAGTAGA GCCCTGOCCT TGCCCCTGTG CTCAAGCTGT CCACCATCCC GGGCACTCCA 2520

AGGCTCAGTG CACCCCAAGC CTCTGAGTGA GGACAGCAGG CAGGGACTGT TCTGCTCCTC 2580 

ATAGCTCCCT GCTGCCTGAT TATGCAAAAG TAGCAGTCAC ACOCTACCX» CTGCTGGGAC 2640 

CTTGTGTTCC CCAAGAGTAT CTGATTCCTC TGCTGTOCCT GCCAGGAGCT GAAGGGTGG6 2700

AACAACAAAG GCAATGGTGA AAAGAGATTA GGAACCCCCC AGCCTGTTTC CArTCTCTGC 2760 

CCAGCAGTCT CTTACCTTCC CTGATCTTTG CAGGGTGGTC CGTGTAAATA GTATAAATTC 2820 

TCCAAATTAT CCTCXAATTA TAAATGTAAG CTTATTTOCT TAGATCATTA TCCAGAGACT 2880 

GCCAGAAGGT GGGTftGGATG ACCTGGGGTT TCRATTGACT TCTGTTCCTT GCTTTTAGTT 2940 

TTGATAGAAG GGAAGAGCTG CAiCTGCACGG TTTCTTCCAG GCTGAGGTAC CTGGATCTTG 3000 

GGTTCTTCAC TGCAGG6ACC CAGACAAGTG GATCTGCTTG CCAGAGTCCT TTTTGCCCCT 3060 

CCCTGCCACC TCCCaSTGTT TOCAAGTCAG CTTTCCTGCA AGAAGAAATC CTGGTTAAAA 3120 

AAGTCTTTTG TATTGGGTCA GGAGTTGAAT TTGGGGTGGG AGGATGGATG CAACT6AAGC 3180 

AGAGTGIGG6 TGCCCAGATG TGOGCTATTA GATGTTTCTC TGATAATGTC CCCAATCATA 3240 

CCAGGGAGAC TCGCATTGAC GAGAACTCAG GTGGAGGCTT GAGAAGGCCG AAAGGGCCCC 3300 

TGACCreCCT GGCTTCCTTA GCTTGQCCCT CAGCTTTGCA AAGAGCCACC CTAGGCCCCA 3360 

GCTGACOGCA TGGGTCTGAG CCAGCTTGAG AACACTAACT ACTCAATAAA AGCGAAGGTG 3420 
GACCKAAAAA AAAAAAAAAA AAAA

Seq ID NO: 69 Protein sequence:
Protein Accession #: NP_068772.1

1 11 21 31 41 51

| | | | | |

HKASP3RPIiI LKRRRLPLPV [illegible] EPKRSPAQQS [illegible] AESHSCXFPA
GIKIINKPTM PNTQWAIPN NANZHSIITA LTAKGKESGS SGPI3XPILIS GQGAPTQPPG
LRPQTQTSYD AKRTBVTLET LGPKPAASDV HLPRPPGAW: EQKRETCADG EAAGCTINSfS
[illegible] SSDGUSSRSI KQEMEEKENC HLBQRQVKVB EPSRPSASHQ NSVSSRPPYS
YMAMIQFAIN STERKRMTLK DIYTWIEDHP PYPKHIAKPG WKNSIRHNIiS LHDMFVRETS
ANGKVSFWTI HPSANRYLTL DQVFKPLDPG SPQLPBHLES QQKRPNPELR RNMTIKTELP
LGARRKMKPL tiPRVSSYLVP IQPPVN^V LQPSVKVPLP LAASLMSSEL ARHSKRVRXA
PKVIiIAEEGI APLSSAGP6K BEKLLFGBGF SPtLPVQTlK EBBIQPGEEM PHLAHPIKVE
SPPliEEWPSP APSFKEESSH SWEDSSQSPT PRPKKSYSGL RSPTRCVSEM LVIQHRERRE
RSSSIUIKQUL LPPCVDEPEIi USEGPSTSR WAAELPPPAD SSDPASQLSY SQEVG6PFKI
PIKETLPISS TPSKSVliPRT PESWRLTPPA KVGGLDPSPV QTSQGASDPL PDPLGLMDLS
TTPLQSAPPL ESPQRLLSSE PLDLISVPFG VSSPSDIDVP KPGSPBFQVS GLAAMBSLTB
GLVIiDTMNDS LSKILIiDZSF PGLDEDPLCP DNINWSQPIP ELO



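Seq ID NO: 70 DNA sequence

Nucleic Acid Accession #: BC006529.1

Coding sequence: 178-2424

1 11 21 31 41 51

| | | | | |

GGCAC3GAGGG GGAOCCGGCC [illegible] [illegible] CCGGGGCCCT GGCT06GCCC 60
CCAGGTTGGA GGAGCCCGGA [illegible] [illegible] CCTAACGGOG GOGGGGACrrG 120
CAGTCTGGAG GGTCCACACT [illegible] [illegible] GAAAAOGCAG ATTCATAATG 180
AAAACTAGCC CCCGTCX3GCC [illegible] [illegible] GGCTGCCCCT TCCTGTTCAA 240
AATGCCCCAA GTOAAACATC [illegible] [illegible] CCCCTGCCCA ACAGGAGTCT 300
AATCAAGCAG AGQCCTCCAA [illegible] [illegible] CTTGCAAGTT TCCAGCTGGG 360
ATCAAGATTA TTAACCACCC [illegible] [illegible] TAGrrCGCCAT CCCCAACAAT 420
GCTAATATTC ACAGCATCAT [illegible] [illegible] GAAAAGAGAG TG6CAGTAGT 480
GGGCCCAACA AATTCATCXrr [illegible] [illegible] CAACTCAGCC TCCAGGACTC 540
CGGCCTCAAA CCCAAACCAG [illegible] [illegible] AAGTGACCCT GGAGACCTTG 600
GGACCAAAAC CTGCAGCTAG [illegible] [illegible] CACCTGGAGC CCTTTGOGAG 660
CAGAAA0GG6 AGACXTTOTGC [illegible] [illegible] GCACTATCAA CAATAGCCTA 720
TCCAACATCC AGTGGCTTCG [illegible] [illegible] [illegible] CAGCATCAAG 780
CAAGAGATGG AGGAAAAGGA [illegible] [illegible] [illegible] GGTTGAGGAG 840
CCTTGGAGAC CATCAGCGTC [illegible] [illegible] [illegible] CTACTCTTAC 900
ATGGCCATGA TACAATTCGC [illegible] [illegible] [illegible] TTT6AAAGAC 960
ATCTATAOGT GGATTGAGGA [illegible] [illegible] [illegible] GCCAGGCTGG 1020
AAGAACTCCA TCCGCCACAA [illegible] [illegible] [illegible] GACGTCTGCC 1080
AATGGCAAGG TCTCCTTCTG [illegible] [illegible] [illegible] GACATTGGAC 1140
CAGGTGTTTA AGGAGCAGAA [illegible] [illegible] [illegible] GACCATCAAA 1200
ACCGAACTCC CCCTGGGCX3C [illegible] [illegible] [illegible] GGTCAGCTCA 1260
TACCTGGTAC CTATCCAGTT [illegible] [illegible] [illegible] CTCGGTGAAG 1320
GTGCCATTGC CCCTGGCX3GC [illegible] [illegible] [illegible] TAGCAAGCX^ 1380
GTCOGCATTG CCCCCAAGGT [illegible] [illegible] [illegible] TTCTTCTGCA 1440
GGACCAGGOA AAGAGGAGAA [illegible] [illegible] [illegible] GCTTCCAGTT 1500
CA6ACTATCA AGGAGGAAGA [illegible] [illegible] [illegible] AGCGAGA(XC 1560
ATCAAAGTGG AGAGCCCTCC [illegible] [illegible] [illegible] TTTCAAAGAG 1620
GAATCATCTC ACTCCTGGGA [illegible] [illegible] [illegible] CAAGAAGTCC 1680
TACAGTGGGC TTAGGTCCCC [illegible] [illegible] [illegible] TCAACACAGG 1740
GAGAGGAGGG AGAGGAGCCG [illegible] [illegible] [illegible] CTGTGTGGAT 1800
GAGCCGGAGC TGCTCTTCTC [illegible] [illegible] [illegible] AGAGCTCCCG 1860
TTCCCAGCAG ACTCCTCTGA [illegible] [illegible] [illegible] AGTGGGAGGA 1920
CCTTTTAAGA CACCCATTAA [illegible] [illegible] [illegible] CAAATCTGTC 1980
CTCCCCAGAA CCCCTGAATC [illegible] [illegible] [illegible] GGGACTGGAT 2040
TTCAGCCCAG TACAAACCCC [illegible] [illegible] [illegible] CCTGGGGCTG 2100
ATGGATCTCA GCACCACrCC [illegible] [illegible] [illegible] GCAAAGGCTC 2160
CTCAGTTCAG AACCCTTAGA [illegible] [illegible] [illegible] TCCCTCAGAT 2220
ATA6A06T0C CCAAGCCAGG [illegible] [illegible] [illegible] AGCCAATCGT 2280
TCTCTGACAG AAGGCCTGGT [illegible] [illegible] [illegible] GATCCTGCTG 2340
GACATCAGCT TTCCTGGCCT [illegible] [illegible] [illegible] CAACIGGTCC 2400
CAGTTTATTC CTGAGCTACA [illegible] [illegible] [illegible] GCTGTCCACC 2460
ATCCCGGGCA CTCCAAGGCT [illegible] [illegible] [illegible] GCAGGCAGGG 2520
ACTGTTCTGC TCCTCATAGC [illegible] [illegible] [illegible] GTCACACCCT 2580
AGCCACTGCT GGGACCTTGT [illegible] [illegible] [illegible] TCCCTGCCAG 2640
GAGCTGAAGG GTGGGAACAA [illegible] [illegible] [illegible] CCCCCAGCCT 2700
GTTTCCATTC TCTGCCCAGC [illegible] [illegible] [illegible] T6GTCCGTGT 2760
AAATAGTATA AATTCTCCAA [illegible] [illegible] [illegible] TTCCTTAGAT 2820
CATTATCCAG AGACTGCCAG [illegible] [illegible] [illegible] TGACTTCTGT 2880
TCCTTGCTTT TAGTrrTGAT [illegible] [illegible] [illegible] TCCAGGCTGA 2940
GGTACCTGGA TCTTGGGTTC [illegible] [illegible] [illegible] GCTTGCCAGA 3000
GTCCTTTTTG CCCCTCCCTG [illegible] [illegible] [illegible] CTGGAAGAAG 3060
AAATOCTGGT TAAAAAAGTC [illegible] [illegible] [illegible] GTGGGAGGAT 3120
CGATGCAACT GAAGCAGAGT [illegible] [illegible] [illegible] TTCTCTGATA 3180
ATGTCCCCAA TCATACCAGG GAGACTGGCA [illegible] [illegible] GGCTTGAGAA 3240
GGCC3GAAAGG GCCCCTGACC TGCCTGGCTT [illegible] [illegible] TTGCAAAGAG 3300
GCACCCTAGG CCCCAGCTGA CCGCATGGGT [illegible] [illegible] TAACTACTCA 3360
ATAAAAGCGA AGGTGGAAAA AAAAAAAAAA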



Seq ID NO: 71 Protein sequence:
Protein Accession #: AAH06529.1

1 11 21 31 41 51

| | | | | |

KECrSPRRPLI LiOUlRLPIiPV QNAPSETSEE BPKaSPACXJE SKQAEASKEV ASSNSCKFPA 60

6IKIIKHP1M PNTQWAIPN NANIHSIITA LTAKGKESGS SGPKKFILIS CGGAPTQPPG 120

LRPOTOTSYD AlOETEVTLET IiGPEa»AARDV NliPRPPGALC EQXRETCADG EAAGCTIKNS 180 

liSNIQWLRKM SSDGLGSRSX KQEMEEKBKC HIiBQRQVKVS SPSRPSASHQ NSVSERPPYS 240 

VTXAMIQPAIN STERKRMTLK DIYTWIEDHF PYFKHXAKPG WKNSIRBNIiS LHDMFVRETS 300 

AKGKVSFWTI HPSANRYI-TL DQVPKQQKRP NPELRRNMTI KTELPUSARR KMKPUiPRVS 360 

SYLVPIQPPV NQSLVIiQPSV KVPLPLAASL MSSELARHSK RVRIAPKVLL AEEGIAPLSS 420

ACPGKEEKLL FGEGPSPLLP VQTIKEBEIQ PGEEKPHLAR PIKVESPPLE EWPSPAPSFK 480 

KESSKSWEDS SQSPTPRPKK SYSGLRSPTR CVSJSCVIQH SERRERSRSR RKQBLLPPCV 540

DSPELLFSEG PSTSRHAAEZ. PFPAJ>SSDPA SQLSYSQEVG GPFKTPIKET IiPISSTPSKS 600 

VLPRTP&SWR LTPPAKVGGI* DPSPVQTPQG ASDPLTOPLG LMDLSTTPLQ SAPPLESPQR 660 

lASSEPLDLI SVPFGSSSPS DIDVPKPGSP BPQVSGLAAH RSLTBGLVU) 'TONDSLSKIL 720
LDISFPGLDE DPLCPDSflNH SQFXPELQ 



Seq ID NO: 72 DNA sequence
Nucleic Acid Accession #: U74612.1
Coding sequence: 178-2583

1 11 21 31 41 51 

GGCAOGASGG GGACCCGGCC GGTCQQGOGC GAGCCCCOST COGGGGCCCT GGCTOGGCCC 60

CCAGGTT6GA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG OCIAAOGGOG GCGGCGACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAAOGCAG ATTCATAATG 180 

AAAACTAGCC CCCGTOGGCC ACTGATTCTC AAAAGAOGGA GGCTGCCCCT TCCTGTTCAA 240

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAXJTCT 300 

AATCAAGCAG AGGCCTCCAA GGAAGTGGCA GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360

ATCAAGATTA TTAACCACOC CAOCATGCCC AACACGCAAG TACTGGCCAT CCCCAACAAT 420 

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTOAOOCT GGAGACCTTC 600

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCX33 CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATCA TACAATTOSC CATCAACAGC ACT6AGAGGA AGOGCATGAC TTTGAAAGAC 960

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCGGGA GACGTCTGCC 1080 

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGOCA ACOGCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC COGAGCACTT GGAATCACAG 1200 

CAGAAAGGAC CQAATCCAGA GCTCCGCCGG AACATGACCA TCAAAACOGA ACTCCCCCTG 1260

GGCGCACGGC GQAAGATGAA GCCACTGCTA CX3U3GGGTCA GCTCATACCT GGTACCTATC 1320 

CAGTTCCCGG TGAACCAGTC ACTGGTGTTG CAGCCCTGGG TGAAGGTGGC ATTGCCCCTG 1380 

GCGGCTTCXX: TCATGAGCTC AGAGCTTGCC GGCCATAGCA AGOGAGTCOG CATTGCCCCC 1440 

AAGGTTTTTG GGGAACAGGT GGTGTTTGGT TACATGAGTA AGTTCTTTAG TGGCGATCTG 1500 

CGAGATTTTG GTACACCCAT CACCAGCTTG TTTAATTTTA TCTTTCTTTO TTTATCAGTG 1560

CTGCTAGCTG AGGAGGGGAT AGCTCCTCTT TCTTCTGCAG 6ACCAGGGAA AQAGGAGAAA 1620 

CTCCTGTTTG GAGAAGGGTT TTCTCCTTTG CTTCCAGTTC AGACTATCAA GGAGGAAGAA 1680 

ATCCAGCCTG GQGAGGAAAT GCCACACTTA GOSAGAOOCA TCAAAGTGGA GAGCCCTCCC 1740 

TTG6AAGAGT GGCCCTCCCC GGCCCCATCT TTCAAAGAGG AATCATCTCA CTCCTGGGAG 1800 

GATTOSTCCC AATCTCCCAC CXX:AAGACCC AAGAAGTCCT ACAGTGGGCT TAGGTCCCCA 1860

ACCCGGTGTG TCTCGGAAAT GCTTGTGATT CAACACAGGG AGAGGAGGGA GAGGAGCCGG 1920 

TCTCGGAGGA AACAGCATCT ACTGCCTCCC TGTGTGGATG AGCCGGAGCT GCTCTTCTCA 1980 

GAGGGGCCCA GTACTTCCCG CTGGGCCGCA GAGCTCCCGT TCCCAGCAGA CTCCTCTGAC 2040 

CCTGCCTCCC AGCTCAGCTA CTCCCAGGAA GTGGGAGGAC CTTTTAAGAC ACCCATTAAG 2100 

GAAACGCTGC CCATCTCCTC CACCCGGAGC AAATCTGTCC TCCCCAGAAC CCCTGAATCC 2160

TGGAGGCTCA OGCCCCCAGC CAAAGTAGGG GGACTGGATT TCAGCOCAGT ACAAACCTCC 2220 

CAGGGTGCCT CTGACCCCTT GCCTGACCCC CTGGGGCTGA TGQATCTCAG CACCACTCCC 2280 

TTGCAAAGTG CTCCCCCCCT TGAATCACCG CAAAGGCTCC TCAGTTCAGA ACCCTTAGAC 2340 

CTCATCTCCG TCCCCTTTGG CAACTCTTCT CCCTCAGATA TAGACGTCCC CAAGCCAGGC 2400 

TCCCXXX3AGC CACAG6TTTC TGGCCTTGCA GCCAATCGTT CTCTGACAGA AGGQCTGGTC 2460

CTGGACACAA TGAATGACAG CCTCAGCAAG ATCCTGCTGG ACATCAGCTT TOCTGGCCTG 2520 

GACGAGGACC CACTGGGCCC TGACAACATC AACTGGTCCC AGTTTATTCC TGAGCTACAG 2580 

TAGAGCCCTG COCTTCCCCC TGTGCTCAAG CTGTCCACCA TCXXX3GGCAC TCCAAGGCTC 2640 

AGTGCACCCC AAGCCTCTGA GTGAGGACAG CAGGCAGGGA CTGTTCTGCT CCTCATAGCT 2700 

CCCroCTGCC TCATTATGCA AAAGTAGCAG TCACACCCTA GCCACTGCTG GGACCTTGTG 2760

TTCCCCAAGA GTATCTGATT CCTCTGCTGT CCCTGCCAGG AGCTGAAGGG TGGGAACAAC 2820 

AAAGGCAATG GTGAAAAGAG ATTAGGAACC CCXCAGCCTG TTTCCATTCT CTGCCCAGCA 2880 

GTCTCTTACC TTCCCTOATC TTTGCAGGGT GGTCC3GTGTA AATAGTATAA ATTCTCCAAA 2940 

TTATCCTCTA ATTATAAATG TAAGCTTATT TCCTTAGATC ATTATCCAGA GACTGCCAGA 3000 

AGGTGGGTAG GATGACCTGG GGTTTCAATT GACTTCTGTT CCTTGCTTTT AGTTTTGATA 3060

GAAGGGAAGA CCTGCAGTGC AOGGrTTCTT CCAGGCTGAG GTACCTGGAT CTTGGGTTCT 3120 

TCACTGCAGG GACCCAGACA AGTGGATCTQ CTTGCCAGA6 TCCTTTTTGC CCCTCCCTGC 3180 

CACCTCOCCG TGTTTCCAAG TCAGCTTTCC TGCAAGAAGA AATCCTGGTT AAAAAAGTCT 3240 

TTTGTATTGG GTCAGGAGTT GAATTTGGGG TGGGAGGATG GATGCAACTG AAGCAGAGTG 3300 

TCGGTGCCCA GATGTGCGCT ATTAGATGTT TCTCTGATAA TGTCCOCAAT CATACCAGGG 3360

AGACTGGCAT TCAOGAGAAC TCAGGTGGAG GCTTGAGAAG GCCGAAAGGG CCCCTGACCT 3420 

GCCTGGCTTC CTTAGCTTGC CCCPCAGCTT TGCAAAGAGC CACCCTAGGC CCCAGCTGAC 3480 

CGCATGGGTG TC3AGCCAGCT T6AGAACACT AACTACTCAA TAAAAOCGAA GGTGGACAAA 3540 
AAAAAAAAAA AAAAA 


Seq ID NO: 73 Protein sequence:
Protein Accession #: AAC51128.1

1 11 21 31 41 51

| | | | | |

MKTSPREUPLI LKERRLPLPV QNAPSBTSKB BPKRSPAQQE SNQABASKBV ABSNSCKFPA 
GZKIZNKPTK PNTQWAIPN HANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 
LRPOTOrSYD AKRTBVTLET U3PKPAARDV NLPRPPGALC BQKRETCADG EAAGCTIUNS 
LSNIQWLRKM SSDGLGSRSI KQEMBBKEKC HLEQHQVKVE EPSRPSASMQ NSVSERPPYS 
YMAMIQFAIN STERRRTfTLK OIYTWIEDHP PYFICKZAKPG NKHSXBBNIiS LHDMPVRETS 
AKGKVSFWTI HPSANRYLTL DQVFKPLOPG SFQLPEHLES QQKHPNPELR RKMTIKTELP 
LGARRKMKPL LPRVSSYLVP IQFPVNQSLV WJPSVKVPLP lAASUaSSEL AHHSKRVRIA 
PKVFGEQWF GYMSKFFSCS) LRDFGTPITS LFKPIFLCLS VUAEEGIAP LSSAGPGKEE 
iCLXj£Y3BGFSP LLPVQTIKEE BIQPGEEMFH LARPIKVES? PLEEWPSPAP SFKEESSHSW 
EDSSQSPTPR PKKSYSGLRS PTRCVSadV XOHROtRSftS RSRHXQHLLP PCVDEPELLF 
SBGPSTSRWA AELPFPADSS DPASQLSYSQ EVGGPFKTPI KETLPISSTP SKSVLPRTPE 
SWRLTPPAKV GGLDPSPVQT SQGASDPLPD PLGLHDLSTT PIflSAPPLES PQRLLSSEPL 
DLISVPFGNS SPSDIDVPKP GSPEPQVSGIi AANRSLTBGl* VUSTKHDSIiS KXLLDISFPG 
LDSDPLGPDN UWSQPIPEL Q 

Seq ID NO: 74 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 111-416

1 11 21 31 41 51

| | | | | |

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTrTG TCCAAACACA CACATCTCAC 
TCATCCTTCT ACTt3GT6AOG CXTOCCROCT CTGGCTTTrr GAAAGCAAAG ATGAGCAACA 
CTCAAGCTGA GAGGTOCATA ATAGGCATGA TOGACATGTT TCACAAATAC ACCAGACGTG 
ATGACAA6AT TGAGAAGCCA AGCCTGCTGA OGATGATGAA GGAGAACTTC CCCAACTTCC 
TTAGTGCCTG TGACAAAAAG GGCACAAATT ACCTCX5CCGA TGTCTTTGAG AAAAAGGACA 
AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCCA 
CAGACTACEA CAAGCAGAGC CATGGAGCAG CGCCCTGTTC CXWGGGCAGC CAGTGACCCA 
GCCCCACCAA TGGGCCTOCA GAGACOCCAG GAACAATAAA ATGTCTTCTC CCACX»GA 

Seq ID NO: 75 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MSNtCAERSI IGMIDMFHKY THRDDKIEKP SLLTMMKENP PNFLSACDKK GTNVIiADVFE 
KKDKNEDKKI DPSEFLSLtiG DIATDYHKQS HGAAPCSGGS Q 



Seq ID NO: 76 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 111-416

1 11 21 31 41 51

| | | | | |

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC
TCATCCTTCT ACTOGTGACA CTTCCCAGTT CrGGCTTTTT GAAAGCAAAG ATGAGCAACA
CTCAAGCTGA GAGGTCCATA ATAGGCATGA TOGACATGTT TCACAAATAC ACCGGACGTG
ATGGCAAGAT TGAGAAGCCA AGCCTGCTGA GGATGATGAA GGAGAACTTC CCCAATTTCC
TCAGTGCCTG TGACAAAAAG GGCATACATT ACCTCGCCAC TGTCTTTGAG AAAAAGGACA
AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCOG
CAGACTACCA CAAGCAGAGC CATGGAGOGG CGCCCTGTTC TGGGGGAAGC CAGTGATCCA
GCCCCACCAA GGQGCCTCCA GAGACCCCAG GAACAATAAG TGTCTCCTCC CACCAGA

Seq ID NO: 77 Protein sequence:
Protein Accession #: XP_04ai24.1

1 11 21 31 41 51

| | | | | |

MSNTQAERSI IGMIDMFHKY TGRDGKIEKP SLLTMMfCENF PNFbSACDKK GZHYLATVPE
KKDKNEDKKI DFSEPLSIiLG DIAADYHKQS HGAAPCSGGS Q



Seq ID NO: 78 DNA sequence
Nucleic Acid Accession #: Z7367a.1
Coding sequence: 253-2433

1 11 21 31 41 51

| | | | | |

GGGGTGGTGC AGGGCAGGGG TGGTATATCC TGTCTGAOGG A6GGCGGGCC TGGGCAGTGC 
CAGAGAGGGA C6AACCAGGG TGGAAGOGCC AGGAGCAGCT GCAGGGAGCC CTCACGCGGA 
CCTCGCACTC TATGGCCGTA GGGAGCOGCT GAGAGCGAGA AGAGCACGCT CCTGCCCGCC 
CGCTGCACCG CACCTCGCCT CGCCTCTCTG CTCTCCTAGG CCCCGGCCGC GCGCCACCCG 
CCTCCCGCCA CCATGAACCA CTCGCCGCTC AAGACCGCCT TGGCGTACGA ATGCTTCCAG 
GACCAGGACA ACTCCAOGTT GGCTTTGGCG TCGGACCAAA AGATGAAAAC AGGCACGTCT 
GGCAGGCAGC G06TGCAGGA 6CAGGTGATG ATGACCGTCA AGCGGCAGAA GTCCAAGTCT 
TCCCAGTCGT CCACCCTGAG CCACTCCAAT OGAGGTTCCA TGTATGATGG CTTGGCTGAC 
AATTACAACT ATGGGACCAC CAGCAGGAGC AGCTACTACT CCftAGTTOCA QGCAGGGAAT 
GGCTCATGGG GATATCCGAT CTACAATGGA ACCCTCAAGC GGGAGCCTGA CAACAOGCGC 
TTCAGCTCCT ACAGCCAGAT GGAGAACTGG AGCCGGCACT ACCCCCGGGG CAGCTGTAAC 
ACCACCGGCG CAGGCAGCGA CATCTGCTTC ATGCAGAAAA TCAAGGCGAG COGCAGTGAG 
CC06ACCTCT ACTGTGACCC ACGGGGCACC CTGCGCAAGG GCAOGCTGGG CAGCAAGGGC 
CAGAAGACCA CCCAGAACOG CTACAGCTTT TACAGCACCT GCAGTGGTCA GAAGGCCATA 
AAGAAGTGCC CTGTGGGGCC GCCCTCTTGT QCCTCCAAGC AGGACCCPGT GTATATCOCG
CCCATCTCCr GCAACAAG6A LXT i ir i ' UCTT T OGCCACTCTA GGGOC AGCTC CAAGATCTGC 960

AGTGAGGACA TOaCTGCAG TGGtSCTCACC ATCCCCAAGG CTGmGCAGTA CCTGAGCTCC 1020 

CAGGATGAGA AGTACCACGC CATTGGGGCC TATTACATCC AGCATACCTG CTTCCAGGAT 1080 
GAATCTGCCA AGCAACAGGT CTATCAGCTC GGAGGCATCT GCSWUXTCGGT GGACCTCCTC 1140

OGCAGCCCCA ACCAtSUbOGT CCAGCaGGCC GCX3GCAGGGG OOCTGOGCAA CCTGGTGTTC 1200 

AGGAGCACCA CCAACAAGCT GGAGACCCGG AGGCAGAATG GGATCCGOGA GGCAGTC3VGC 1260 

CTCCTGAGGA GAAC0C3GGAA CGCCGAGATC CAGAAGCAGC TGACTgGCT GCTCTGGAAC 1320

CTGTCTTCCA CTGAOGAGCT GAAGGAGGAA CTCATTGCOa ACX30CCTG0C TGTTCrGGCC 1380 

GACCGCGTCA TCATTCCCTT CTCTGGCTGG TGCGATGGCA ATAGCAACAT GTOCOGGGAA 1440 

GTGGTGGACC CTGAGGTCTT CTTCAATGCC ACAGGCTGCT TGAGGAACCT GAGCTCGGCC 1500

GATGCAGGCC GCCAGACCAT GOGTAACTAC TCAGGGCTCA TTGATTCCCT CATGGCCTAT 1560 

GTCCAGAACT GTCTACSOSGC CAGCCGCTGT GACGACAAGT CTCTGGAAAA CPGCATGTGT 1620 

GTTCTGCACA ACCTCTCCTA CCGCCTGGAC GCCGAGGTGC GCAOOOGCTA GOGGCAGCTG 1680 

GAGTATAAOG CCOGCAACGC CTACACOGAG AAGTCCTCCA CTOGCTGCTT CAGCAACAAG 1740 

AGCGACAAGA TGATGAACAA CAACTATGAC TGCCOXTGC CTGAGGAAGA GAOCAACCCC 1800 

AAGGGCAGCG GCTGGTTGTA CCATTCAGAT GCCATCCGCA CCTACCTGAA CCTCATGGGC 1860

AAGAGCAAGA AAGATGCTAC CCTGGAGGCC TGTGCTGCTG CCCTGCAGAA CCTGACAGCC 1920 

AGCAAGGGGC TGATGTCCAG TGGCATGAGC CAGTTGATTG GGCTGAAGGA AAAGGGCCTG 1980 

CCACAAATTG CCOGCCTCCT GCAATCTGGC AACTCTGATG TGGTGGGGTC OGGAGCCTCC 2040 

CTCCTGAGCA ACATGTCCCG CCACCCTCTG CTGCACAGAG TGATC6GGAA CCAGGTGTTC 2100 

CCGGAGGTGA CXaVGGCTCCT CACCAGCCAC ACTGGCAATA OCMXAACTC OGAAGACATC 2160 

TTGTCCTCGG CCTtSCTACAC TGTGAGGAAC CXGATGGCCT CX5CAGCCACA ACTGGCCAAG 2220 

CAGTACTTCT CCAGCAGCAT GCTCAACAAC ATCATCAACC TGTGCOGAAG CAGTGCCTCA 2280 

CCCAAGGCCG CAGAAGCTGC CCGGCTTCTC CTGTCTGACA TGTGGTCCAG CAAGGAACTG 2340 

CAGGGTGTCC TCAGACAGCA AGGTTTCGAT AGGAACATGC TGGGAAOCTT AGCTG6GGCC 2400 

AACAGCCTCA GGAACTTCAC CTOCOGATTC TAAGAAGAGA CTGTCCAAGC A AGTTA GGCT 2460 

TCCAGGAAGA TATGACCCAG CTGA6AAG0C CTCAQGCCTC GCTGGATGGG GTTTTCTGTC 2520

CATCCTGTGC AGTATTTGGG AAAGTTCACA AGAAACTGAG AAGAAAOCTA AAAACTGTGG 2580

ATAGTGGAAA GATTTTTAGA TTTTTTTTTT CCTTGGGGAA ACPGGCS^GGC AATGGGGGTT 2640 

AGGGAGGTTG GGGOGGGGGG GGCTTTCTTG AGTTAAAGGG GCTTATATGT GATGTCAATA 2700 

TTTCTTCCTC TGAGAAATGG TATATATATC TGTCTAATGT AAGTGTGTGC ATGCATGTGC 2760 

GCGTCCATGT GTGTGTOTGT GAGTGTCTTA AAGCATAACC ACAAACTGCA AAAAGCTAGG 2820 

TAAGCTATTT TGTTGCAGCT CATAAGGTGG TGAAAAGGAC TCTCCTGTGT TTCTT ACTC A 2880 

TAGGCAAGC3A CAACATGTGC TTTTTGGTGA GCTGCTCATA ATTCCTOAAA TGTGTGGTGC 2940 

CAGGGCAAGG GGGCCATCAC TGCAGTCAGG CCCTCAGAGG AGTCCTGCAG GCTTCCTACC 3000 

AGTGGTCTCC AAGGGTGCAG GAGTAACTGG GGCTGGGCCA GCCTCCCCCC TTACAAGGCT 3060 

GCTTTCCACG AAGGGAGGTC TGGTGTATCT CATGGGAGAA TCTGGGGTGT CTGTAGTGTC 3120 

ACCCCTCCAG CAGCGCCACA AGGACTGAGG TTGGGTAGGT GT6AGGTTCC AGAGGACAGC 3180 

AGGACACTCT OGCATACTTT GCCAAAT6AG GCCXGCTCAG AGGAGTAGGA GCTGAAAGAT 3240 

GGTGCCTTCC ACCCTCTTGG GCTGTGTGCC CATCAGAGCA GGCTCAGCCT GCAAA GGCCC 3300 

TGCATTCAGA GGTCTTGTAA TCTACTTGTT GCAGGAGAAA GAAGGTAAAA AATGATTTTT 3360 

TTAAGAAAAG CTATTTTATT GCAGCTCTTT CCCAAGAGCT GTTCTGGGAA TGGCTGGTCT 3420 

TCATATTCCC AGTGGAGAGG GGAACAAGTG GGGCTGGGCA TATACCTATT CCGGCTTCTA 3480 

GTGGGATGGA GTTGQGGTAT AGAAATTAAC CAGGAAGATG TTTCCACCAA GCCTGCTGTG 3540 

AGTCAATTGA GGGAGTGTTT GGGTCCCAGG AGACTTGGAC GGGGGGAGTT TGGGTAGACT 3600 

AGGAAAGGAA AGTGCCATAT CAGGGTACCG GTACX3GGCAA GCTCACATCT CAGCCAGGGG 3660 

CCATGCCCCA CTTCCCCTGA CCCCAGCTGT CTTGTCTCCA CTCTGTGAAA CCCACAGGGG 3720 

ATGTGATAAA CAGGGCTATT AGGGGTATCA GCCACGTCGA GCCCCCAGAC TCTGTGCACT 3780 

TCAGACCAGC AGCAGCAGGA GGGCTCCCGA GGGCCTTATG AGAAAACCTG TGTGGACATC 3840 

CCTTGGTGTA CACTAAGACA GAGCAGAGCC CAGCXSCTCCC AAGCCTTCCT CCTTCCAGCT 3900 

TCTACCTCCA TGCTAGCATT GCTGGTGTTA GAGAGGAATT AACTTCCTGG TCTGTGCCCT 3960 

TCTCTAGAAG AATATAAGAT GCTCCTCCTC CTCAOCCCTT CTCftGCCTCC TCCCAAGTCT 4020 

TOCTCTTCTG CACCACCCCC GAGTCCAAAC CCACCTCTT6 CCCCAGCATT CAGGCTGGAA 4080 

AACACTCATG TCGACTCAGT AT6ACAACTG AGATGGGGGA AGCCAGACAT GTGAGGACGC 4140 

TGTCCTCCGA GAGGTGTCCC CGGCTGTTAG CCAGCTGTGC TGTGGTGCTG TGGGTCTGTC 4200 

ATACCCTCCC TTGCTTCTGT TCACACTGGG AGGCCXa^CTC CTGGCTCACC TCTCCCTCTC 4260 

AGGGACCCAC GTQGGAGCCT GGATCCCTGG ACTGTCCTGG GCATAGGTTT CAGGGGCCTC 4320 

CTTTGTTGTC ATCAGAACCC AGAGGAATTC TTCTCCTAAA AAATACGTAT GGCATACCAA 4380 

TCTGTGCGGG GCAGTGTCCT AAGCACTTAG ACTACATCAG GGAAGAACAC AGACCACATC 4440 

CCCGTCCTCA TGCGGCTTAT GTTTTCTGGA GGAAAGTGGA GACACAAGTC CTTGGCTTTA 4500 

GGGCTCCCCC GGCTGGGGGC TGTGCAGTCC GGTCAGGGCG GGAGGGGAAA TGCACOGCTG 4560 

CATGTGAACC TTACCAGCCC AGGCGGATGC CCCTTCCCCT TAGCACTACC CTGGCCTCCT 4620 

GCATCCCCTC GCXTCATGTT CCTCCCACCT TCAAAGAATG AAGAGCCCCA TGGGCCCAGC 4680 

CCCTGCCCTG GGAACCAGGC AGCCTTCCAG ACCTCAGGGG CTGAGGCAGA CTATTAGGGC 4740 

AGGGCTGACT TTOGTGACaU: TGCCCATTOC CTCTCAGGCC AGCTCAGGTC ACCOGtSGCCT 4800 

CTCACCCAGG CCTGTCACTT TGAGAQGGGC AAAACTGA6A GGGGCTTTTC CTAGAGAAAG 4860 

AGAACAAGGA GCTTGCCAGG CTTCATGTAG CCGACACAOG TCTCAGGATT TTAAGTCCAC 4920 

ATTGGCCTCA CACTAGCCTA GGCCAATGCC CAAAATAAGG AGTTCCAATT TGGGGCCAAA 4980 

TGAGGAAGGA CACAGACTCT GCCCTGGGAT CTCCTGTGCT AGOGGCCAAT GACAAATCCA 5040 

GTCATTGGCC A0CA6CCACC TCTGCAGTGG GGACCACACT AGCAGCCCTG ACTCCACACT 5100

CCrCCTGGGG ACCCAAGAGG CA6TGTTGCT GTCTGOGTGT CCACCTTGGA ATCTGGCTGA 5160 

ACTCGCTGGG AG6ACCAAGA CTGCGGCTGG GGTGGGCAGG GAAGG6AAGC OGGGGGCTGC 5220 

TGTCAGGGAT CTTGGAGCTT CCCTGTAGCC CACCTTCCCC TTGCTTCATG TTTGTAGAGG 5280

AACCTTGTGC CGGCCAGGCC CAGTTTCCTT GTeiQATACA CTAATeTATT TGCTTTTTTT 5340 
GGAAATAGAG AAAATCAATA AATTGCTAGT GTTTCTTTGA AAAAAAAAA 

Seq ID NO: 79 Protein sequence:
Protein Accession #: CAA98022.1

1 11 21 31 41 51

| | | | | |

KNHSPLKTAL AYECFQDQIJN STLALPSDQK MKTGTSGRQR VQEQVMMTVK RQKSKSSQSS 60 

TLSHSNRGSM YDGLADNYNY GTTSRSSYYS KFQAGNGSWG YPIYNGTl^ EPDNHflFSSY 120 

SQMENHSSHY PRGSCNTTQA GSDICFMQKI KASRSEPDLY CDPRGTLRKG TIX5SKGQKTT 180 

QNRYSPYSTC SGQKAIKKCP VRPPSCASKQ DPVYIPPISC NKDLSEGHSR ASSKICSEDI 240 

ECSGLTIPKA VQYLSSQDEK YQAIGAYYIQ HTCPQDBSAK QQVyQLGGIC KLVDLLRSW 300 

QNVQQAAAGA LSNLVPRSTT 17KLETRSQKG IREAVSIiLRR IXSIAEIQKQL T GL L WML SST 360 
DBLKEELIAD AXoFVLADBVI ZPFSGgiCDG!7 SSMSREWDP EVFPSATGCL BKLSSADA6R 420

QTMRNYSGLI DSLMAYVQSC VAASIK53DKS VENOtCVLHN I*SYHLDAKVP TRYR QLEYHR 480 

RNAYTBKSST GCFSNKSDKM MNNNYDCPIiP EECTIJPKGSG WI,YHSDAIHT YLNLfflGKSKK 540

DATLE2VCAGA LQNLTASKGL MSSa4SQLX6 UKEKGLPQIA RLLQSGN'SDV VRSGASLLSK 600

MSRHPLLHRV KGNQVFPBVT RltLTSBTOrT SNSEDILSSA CYTVaZiLMAS QPQLAKQYFS 660

Sa4LNHIINL CRSSASPKAA EAAKLLLSaM WSSKBLQ6VL RQQGFOfaOCL GTIAGAKSLR 720 
HFTSRP 



Seq ID NO: 80 DNA sequence

Nucleic Acid Accession #: NM_006516.1

Coding sequence: 180-1658

1 11 21 31 41 51

| | | | | |

TAGTCX3CGCG TCCCCGAGTG AGCACGCCAG GGAGCAGGAG ACCAAACGAC GGGGGTOSGA 60 

GTCAGAGTCG CAGTGGGAGT CCCCGGACOG GflCCAOGAGC CTGAGOGGGA GAGOXXXSCT 120 

OGCACGCCOG TCGCCACCCX5 OGTACCCGGC GCAGCCAGAG CCACCAGCGC AGCGCTGCCA 180 

TGGAGCCCAG CAGCAAGAAG CTGAOGGGTC GOCTCATCCT GGCTGTGGGA GGAGCAGTGC 240 

TTGGCTCCCT GCAGTTTGGC TACAACACTG GAiGTCATCAA TGCCCCCCAG AAGGTGATOG 300 

AGGAGTTCTA CAACCAGACA TCGGTCCACC GCTATGGGGA GAGCATCCTG COCACCftOSC 360 

TCACCACGCr CTGGTCCCTC TCAGTGGOCA TCTTTTCTtST 'KSGGGGCATG ATTGGCTCCT 420 

TCTCTGTGGG CCTTTTOGTT AACCGCTTTG GCOCSGCGGAA TTCAATGCTG ATGATGAACC 480

TCCTGGCCTT CGTGTCOGCC GTGCTCATGG GCTTCTCGAA ACTGGGCAAG TCCTTTCTGA 540

TCCTCATCCT GGOeXGCTPC ATCATGGGTG TCTACTGOSG CCTGAOCACA GGCTTOGTOC 600

CCATCTATGT GGGTGAAGTG TCACCCACAO CCTTTOGIGG GGCCCTGGGC ACCCTGCACC 660

AGCTCGGCAT CGTOGTOGGC ATCCTCATCX5 CCCAGGTGTT OGGCCTGGAC TCCATCATGG 720 

GCAACAAGGA CXTTGTGGCCC CTGCTGCTGA GCATCATCTT CATCCCGGCC CTGCPGCAGT 780 

GCATCGTGCT GCCCTTCTGC CCCGAGAGTC CCCGCTTCCT GCTCATCAAC CGCAACGAGG 840

A6AACCGGGC CAAGAGTGTG CTAAAGAAGC TGGGGGGGAC AGCTGACGTG ACCCATGACC 900 

TCCAGGAGAT GAAGGAAGAG AGTCGGCAGA TGATGOGGGA GAAGAAGGTC ACCATCCTGG 960 

AGCTOTTOCXS CTCCCCOSOC TACOSCCAGC CX31TCCPCAT CGCTGTGGTG CTGCAGCTOT 1020 

CCCAGCAGCT GTCTGGCATC AACGCTGTCT TCTATTACTC CACGAGCATC TTOGAGAAGG 1080 

C3GGGGGTGCA GCAGCCTGTG TATGCCACCA TTGGCTCOGG TATCGTCAAC ACGGOCTTCA 1140 

CTGTOGTGTC GCTGTTTGTG GTGGAGCGAG CAGGCCGGCX3 GACCCTGCAC CTCATAGGCC 1200 

TOGCTGGCAT GGCGGGTTGT GCCATACTCA TGACCATCGC GCTAGCACTG CT GGAGC AGC 1260 

TACCCTGGAT GTCCTATCTG AGCATCGTGG CCATCTTTGG CTTTGTGGCC TTCTTTGAAG 1320 

TGGGTCCTGG COXATCCCA TGGTTCATCG TGGCTGAACT CTTCAGCCAG GGTCCACGTC 1380 

CAGCTGCCAT TGCOGTTGCA GGCTTCTCCa ACTGGACCTC AAATTTCATT GTG6GCATGT 1440 

GCTTCCAGTA TGTGGAGCAA CTGTGTGGTC CCTACGTCTT CATCATCTTC ACTGTGCTCC 1500

TCGTTCTGTT CTTCATCTTC ACCTACTTCA AAGTTCCTGA GACTAAAGGC CGGACCTTCG 1560 

ATGAGATCGC TTCCGGCTTC CGGCAGGGGG GAGCCAGCCA AAGTGATAAG ACACCCGAGG 1620 

AGCTGTTCCA TCCCCTGGGG GCTGATTCCC AAGTGTGAGT CGCCCCAGAT CACCAGCCCG 1680 

GCCTGCTCCC AGCAGCCCTA AGGATCTCTC AGGAGCACAG GCAGCTGGAT GAGACTTCCA 1740 

AACCTGACAG ATGTCAGCOG AGCCGGGCCT GGGGCTCCTT TCTCCAGCCA GCAATGATGT 1800 

CCAGAAGAAT- ATTCAGGACT TAACGGCTCC AGGATTTTAA CAAAAGCAAG ACTGT PGCTC 1860 

AAATCTATTC AGACAAGCAA CAGGTTTTAT AATTTTTTTA TTACTGATTT TGTTATTTTT 1920 

ATATCAGCCT GAGTCTCCTG TGCCCACATC CCAGGCTTCA CCCTGAATGG TTOCATGCCT 1980 

GAGGGTGGAG ACTAAGCCCT GTCGAGACAC TTGCCTTCTT CACCCAGCTA ATCTGTAGGG 2040 

CTGGACCTAT GTCCTAAGGA CACACTAATC GAACTATGAA CTACAAAGCT TCTATCCCAG 2100 

GAGGTGGCTA TGGCCACCCG TTCTGCTGGC CTGGATCTCC CCACTCTAGG G GTCA GGCTC 2160 

CATTAiGGATT TGCCCCTTCC CATCTCTTCC TACOCAACCA CTCAAATTAA TCTTTCTTTA 2220 

CCTGAGACCA GTTGGGAGCA CTGGAGTGCA GGGAGQAGAG GGGAAGGGCC AGTCTGGGCT 2280 

GCCGGGTTCT AGTCTCCTTT GCACTGAGGG CCACACTATT ACCATGAGAA GAGGGCCTGT 2340 

GGGAGCCTGC AAACTCACTG CTCAAGAAGA CATGGAGACT CCTGCCCTGT TGTGTATAGA 2400 

TGCAAGATAT TTATATATAT TTTTGGTTGT CAATATTAAA TACAGACACT AAGTTATAGT 2460 

ATATCTGGAC AAGCCAACTT GTAAATACAC CAOCTCACTC CTGTTACTTA CCTAAACAGA 2520 

TATAAATGGC TGGTTTTTAG AAACATGGTT TTGAAATGCT TGTGGATTGA GGGTAGGAGG 2580 

TTTGGATGGG AGTGAGACAG AAGTAAGTGG GGTTGCAACC ACTGCAACGG CTTAGACTTC 2640 

GACTCAGGAT CCAGTCCCTT ACACGTACCT CTCATCAGTG TCCTCTTGCT CAAAAATCTG 2700 

TTTGATCCCT GTTACCCAGA GAATATATAC ATTCTTTATC TTGACATTCA AGGCATTTCT 2760 

ATCACATATT TGATAGTTGG TGTTCAAAAA AACACTAGTT TTGTGCCAGC CGTOATGCTC 2820 
AGGCTTGAAA TG6CATTATT TTC3AATGTGA AG6GAA 

Seq ID NO: 81 Protein sequence:
Protein Accession #: NP_006507.1

1 11 21 31 41 51

| | | | | |

NEPSSKXLTG RU4LAVGGAV LGSLQF6YMT GVIKAPQRVI EBFYNQTHVH RYGBSILPTT 60 

LTTLWSLSVA IPSVGGMIGS FSVGLPVNRF GRRN»ILMMK LLAPVSAVLM GPSKLGKSFE 120 

MI.ILGRPIIG VYCGLTTGPV PMYVGEVSPT AFRGALGTtiH QLGIWGILI AQVFGLPSIM 180 

GNKDLWPLUL SIIPIPALLQ CIVLPPCPES PRFLLrNRNE ENRAKSVLKK LRGTAUVTHD 240 

LQEMKEESRQ MMREKKVTUi ELPRSPAYRQ PILIAWLQL SQQLSGINAV FYYSTSIFEK 300 

AGVQQPVYAT IGSGIVNTAP TWSLFWER AGRRTLHLIG LAGMAGCAIL MTIALALLEQ 360 

I.PWMSYLSIV AIFGFVAFFE VGPGPIPWFI VAELFSQGPR PAAIAVAGFS NWTSNFIVGM 420 

CFQYVEQLOG PYVFIIFTVL LVLFPIPTYP KVPETKGRTF DEIASGPROG GASQSDKTPE 480 
ELPBPLGADS QV 

Seq ID NO: 82 DNA sequence
Nucleic Acid Accession #: BC001291
Coding sequence: 44-541

1 11 21 31 41 51

| | | | | |

GGGGGCGCOG OSCGCTGACC CTCCCTGGGC ACCGCTGGGG AOGATGCOGC TGCTCGCCTT 60 

GCTGCTGGTC GTGGCCCTAC CGCGGGTGTG GACAGACGCC AACCTGACTG CGAGACAACG 120 
AGATOCAGA6 GACTCCCftGC GAAOGGAOGA GGGTGACAAT AGAGTGTOGT GTCATGTTTG 180
TGAGAGAGAA AACACTTTCG AGTGCCAGAA 0CCAA06AGG TGCAAATOGA CAGAGCCATA 240
CTGOGTTATA GOGGCCGTGA AAATATTTCC AOGrrTTTTC ATGGTTGOGA AGCAGTGCTC 300
OGCTGGTTGT GCAGOGATCG AGAGACCCAA 6CCASAGGAG AAGCGGTTTC TGCTGGAACA 360
GCCCATGCCC TTCTTTTACC TCAACTC3TTG TAAAATrOGC TACTGCAATT TAGAGGGGCC 420
AOCTATCAAC TCATCAGTGT TCAAAGAATA ^GCTGGGAGC ATGGGTGAGA GCTGTC3GTQG 480
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC CICCATTGCA GCGGGCCTCA GCCTGTCTTG 540
AGCCAOGGGA CTGCCAOUjA CTGAGCCTTC OGGAGOITGG ACTGGCTCCA GAOCGTTGTC 600
ACCTGTTGCA TTAAACTTGT TTTCTGTTGA TTACCTCTT6 GTTTGACTTC CQGGGTCTT 660
GGGATGGGAG AGTGGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT TCTTTAACTC 720
ACATTCACAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT TTTCTcrrrG 780
AAATCAAAOC TTGTAACTCA TTTATTGCTG ATGGOCACTC TTTTCCTTGA CTCCCCTCTG 840
OCTCTGAOGG CTTCAGTATT 6ATGG0GAGG GAGGCCTAAC TACCACTCAT GGAGAGTATG 900
TGCTGAGATG CTTCOGACCT TTCAa?K3AC GCAGGAACAC TGGGGGAGTC TGAATGATTG 960
GGGTX3AAGAC ATCCCTGGAG TGAAGGACTC CTCAGCATGG GGGGCAGTGG GGCACACGTT 1020
AGGGCTGCCC CCATTCCAGT GGTOGAGtSOG CTGTGGATGG CTGcrrrrcc TCAACCT7TC 1080
CTAOCAGATT CCAGGAGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA GACTTCACCC 1140
AiCCAGCTGOC ACAGGTGCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA 1200
ACTTAGGCCA AGTAGAGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCrGTT AAGATGCAGC 1260
CATCCATGGG GAGCTGAGAA ATCAGACTCA AAGTTCCACC AAAAACAAAT ACAAGGGGAC 1320
TTCAAAAGTT CAOGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA

Seq ID NO: 83 Protein sequence:
Protein Accession #: AAH01291

MALLALLLVV ALPRVWTDAN LTARQRDPBD SQRTDECSJNR VWCHVCEREN TFEOQNPRRC 60
KHTEPYCVIA AVKIPPRPPM VAKQCSAGCA AMERPKPEEK RFLLEEPMPP PHiKCCKIRY 120
CNLEXSPPINS SVFKEYAGSM GESCGGLWTA ILLOjIiASIAA QLSIS

Seq ID NO: 84 DNA sequence 

Nucleic Acid Accession #: NM_022893.1

Coding sequence: 229-2726 



1 11 21 31 41 51
| | | | | |

TTTTTTTTTT TTTTTTGCTT AAAAAAAAGC CATGAOGGCT CTCCXy^CAAT TCATCTTCCC 60
TGCGCCATCT TTGTATTATT TCTAATTTAT TTTGGATGTC AAAAGGCACT GATGAAGATA 120
TTTTCTCTGG AGTCTCCTTC TTTCTAACCC GGCTCTCOOG ATGTGAACCG AGCCGTCGTC 180
OGCCCGCCGC CGCCGCCGCC GCCGCCGCCG CCCGCCCCGC AGCCCACCAT GTCTOGCCGC 240
AAGCAAGGCA AACCCCAGCA CTTAAGCAAA CGGGAATTCT CGCC0GA6CC TCTTGAAGCC 300
ATTCTTACAG ATGATGAACC AGACXavCGGC CCGTTGGGAG CTCCAGAAGG GGATCATGAC 360
CTCCTCACCT GTGGGCAGTG CCAGATGAAC TTCCCATTGG GGGACATTCT TATTTTTATC 420
GAGCACAAAC GGAAACAATG CAATGGCAGC CTCTGCTTAG AAAAAGCTGT GGATAAGCCA 480
CCTTCCCCTT CACCAATGGA GATGAAAAAA GCATCCAATC COGTGGAGGT TGGCAtCCAG 540
GTCACGCCAG AAGATGAOGA TTGTTTATCA AOGTCATCTA GAAGAATTTG CCCCAAACAG 600
GAACACATAG CAGATAAACT TCTGCACTGG AGGGGCCTCT CCTCCCCTOG TTCTGCACAT 660
GGAGCTCTAA TCCCCACGCC TGGGATGAGT GCAGAATATG CCCCX3CAGGG TATTTGTAAA 720
GATGAGCCCA GCAGCTACAC ATGTACAACT TGCAAACAGC CATTCACCAG TGCATGGTTT 780
CTCTTGCAAC ACGCACAGAA CACTCATGGA TTAAGAATCT ACTTAGAAAG CGAACACGGA 840
AGTCCCCTGA CCCCGCGGGT TGGTATCCCT TCAGGACTAG GTGCAGAATG TCCTTCCCAG 900
CCACCTCTCC ATG6GATTCA TATTGCAGAC AATAACCCCT TTAACCTGCT AAGAATACCA 960
GGATCAGTAT CGAGAGAGGC TTCCGGCCTG GCAGAAGGGC GCTTTCCACC CACTCCCCCC 1020
CTGTTTAGTC CACCACCGAG ACATCACTTG GACCCCCACC GCATAGAGCG CCTGGGG60G 1080
GAAGAAATGG CCCTGGCCAC CCATCACCCG AGTGCCTTTG ACAGGGTGCT GCX3GTTGAAT 1140
CCAATGGCTA TGGAGCCTCC CX3CCATGGAT TTCTCTAGGA GACTTAGAGA GCTGGCAGGG 1200
AACACGTCTA GCCCACCGCT GTCCCCAGGC OGGCCCAGCC CTATGCAAAG GTTACTGCAA 1260
CCATTCCAGC CAGGTAGCAA GCCGCCCTTC CTGGOGAOGC CCCCCCTCCC TCCTCTGCAA 1320
TCCGCCCCTC CTCCCTCCCA GCCCXXX3GTC AAGTCCAAGT CATGC6AGTT CTGCGGCAAG 1380
ACGTTCAAAT TTCAGAGCAA CCTGGTGGTG CACCGGCGCA GCCACACGGG CGAGAAGCCC 1440
TACAAGTGCA ACCTGTGCGA CCACGCGTGC ACCCAGGCCA GCAAGCTGAA GCGCCACATG 1500
AAGACX3CACA TGCACAAATC GTCCCCCATG ACGGTCAAGT CCGACGACGG TCTCTCCACC 1560
GCCAGCTCCC CGGAACCCGG CACCAGCGAC TTGGTGGGCA GCX3CCAGCAG CGCGCTCAAG 1620
TCOGTGGTGG CCAAGTTCAA GAGOGAGAAC GACCCCAACC TGATCCCGGA GAACGGGGAC 1680
GAGGAGGAAG AGGAGGAOGA CGAGGAAGAG GAAGAAGAGG AGGAAGAGGA GGAGGAGGAG 1740
CTGACGGAGA GCGAGAGGGT GGACTACGGC TTOGGGCTGA GCCTGGAGGC GGCGOSCCAC 1800
CACGAGAACA GCTCGCGGGG CGCGGTCGTG GGOGTGGGOG AOGAGAGCOG OGCCCTGOCC 1860
GACGTCATGC AGGGCATGGT GCTCAGCTCC ATGCAGCACT TCAGCGAGGC CTTCCACCAC 1920
GTCCTGGGCG AGAAGCATAA GCGCGGCCAC CTGGCOGAGG CCGAGGGCCA CAGGGACACT 1980
TGCGACGAAG ACTCGGTGGC CGGCX5AGTCG GACGGCATAG ACGATGGCAC TGTTAATGGC 2040
CGCGGCTGCT CCCCGGGOGA GTCGGCCTCG GGGGGCCTGT CCAAAAAGCT GCTQCTGGGC 2100
AGCCCCAGCT OGCTGAGGCC CTTCTCTAAG OGCATCAAGC TGGAiGAAGGA QTTOGACCTG 2160
CCCCOGGCCA CGATGCCCAA CAOGGAGAAC GTGTACTCGC AGTGGCTCGC CGGCTACGC6 2220
GCCTCCAGGC AGCTCAAAGA TCCCTTCCTT AGCTTOGGAG ACTCC3VGACA ATCGCCTTTT 2280
GCCTCCTCGT CGGAGCACTC CTCGGAGAAC GGGAGCTTGC GCTTCTCCAC ACCGCCCGGG 2340
GAGCTGGACG GAGGGATCTC GGGGCGCAGC GGCACQGGAA GTGGAGGGAG CACGCCCCAT 2400
ATTAGTGGTC CG6GCACG0G CAGGCCCAGC TCAAAAGAGG GCAGACXSCAG CGACACTTGT 2460
GAGTACTGTG GGAAAGTCTT CAAGAACTGT AGGAATCTCA CTGTCCACAG GAGAAGCCAC 2520
ACGGGCGAAA GGCCTTATAA ATGCGAGCTG TGCAACTATG CCTGTGCCCA GAGTAGCAAG 2580
CTCACCAGGC ACATGAAAAC GCATGGCCAG GTGGGGAAGG ACGTTTACAA ATGTGAAATT 2640
TGTAAGATGC CTTTTAGOGT GTACAGTACC CTGGAGAAAC ACATGAAAAA ATGGCACAiGT 2700
GATCGAGTGT TGAATAATGA TATAAAAACT GAATAGAGGT ATATTAATAC CCCrCCCTCA 2760
CTCCCACCTG ACACCCCCTT TTTCACCACT CCCTTTCCCC ATCGCCCTCC AGCCCCACTC 2820
CCTGTAGGAT TTTTTTCTAG TCCCATGTGA TTTAAACAAA CAAACAAACA AACAGAAGTA 2880
ACGAAGCTAA GAATATGAGA GTGCTTGTCA CCAGCACACC TGTTTTTTTT CTTTTTCTTT 2940
TTCTTTTTTC TTTTTCCTTT rrxiii ' T ' rr T TOCTTTATGT TCTCACCGTT TGAATGCATG 3000

ATCIGTATOS GGCAATACTA TTGCATTTTA OGCaAACTTT GAGCCTTTCT CTTGTGCAAT 3060

AATTTACATG TTGTGTATGT TTTTTTTTAA ACTTAGACAG CATGTATGGT ATGT TATGG C 3120 

TATTTTAAAT TGTCCCTAAT TOGTTGCTGA GCAAACATGT TGCTGTTTCC AGTTCOGTTC 3180 

T6AGAGAAAA AGAGAGAGAC AGAGAAAAAG ACX3VTGCTGC ATACATTCTG TAATACATAT 3240 

CATCTACAGT TTTATTTTAT AACGTGAGGA GC5AAAAACAG TCTTTGGATT AACCCTCTAT 3300 

AGACAGAATA GAXAGCACT6 AAAAAAAATC TCTATGAGCT AAATGTCTGT CTCTAAAGGG 3360 

TTAAATGTAT CAATTOtSAAA GGAAlGAAAAA AGC3CCTTGAA TTGACAAATT AACAiGAAAAA 3420 

CAGAACAAGT TTATTCTATC ATTTGGTTTT AAAATATGAG TGCCTTGGAT CTATTAA AAC 3480 

CACATCGATG GTTCTTTCTA CTTGTTATAA ACTTGTAGCT TAATTCAGCA TTGG6TGAGG 3540 

TAATAAACCT TAGGAACTAG CATATAATTC TATATTGTAT TTCTCACAAC AATGGCTACC 3600 

TAAAAAGATG ACCCATTATG TCCTAGTTAA TCATCATTTT TCCTTTAGTT TAATTTTATA 3660 

AACAAAACTG ATTATAOCAG TATAAAAGCT ACTTTGCTCC TGGTGAGAGC TTAAAAGAAA 3720 

TGGGCTGTTT TGOCCAAAGT TTTATTTTTT TTAAACAATG ATTAAATTGA ATGTGTAATG 3780 

TCCAAAAGCC CTGGAAOGCA ATTAAATACA CTAGTAAGGA GTTCATTTTA TGAAGATATT 3840 

TCCTTTAATA ATGTCTTTTT AAAAATACTG GCACCAAAAG AAATAGATCC AGATCTACTT 3900 

GGTTGTCAAG TGGACAATCA AATGATAAAC TTTAAGACCT TGTATACCAT ATTGAAAGGA 3960 

AGAGGCTGAC AATAAGGTTT GACAGAGGGG AACAGAAGAA AATAATATGA TTTATTAGCA 4020 

CAAOSTCGTA CTATTTOOCA TTTAAAACtA GAACAGGTAT ATAAGCTAAT ATTGATACAA 4080 

TCATGATTAA CTATGAATTC TTAAGACTTG CATTTAAATG TGACATTCTT AAAAAAAGAA 4140 

GAGAAAGAAT TTTAAGAGTA GCAGTATATA TGTCTGTGCT CCCTAAAAGT TG TACTTC AT 4200 

TTCTTTTCCA TACACTGTGT GCTArTTGTG TTAACATGGA AGAGGATTCA TTGTTTTTAT 4260 

TTTTATTTTT TTAATTTTTT CTTTTTTATT AAGCTAGCAT CTGOCOCAGT TGGTGTTCAA 4320 

ATAGCACrre ACTCTGCCTG TGATATCTGT ATCTTPTCTC TAATCAGAGA TACAGAGGIT 4380 

GAGTATAAAA TAAACCTGCT CAGATAGGAC AATTAAGTGC ACTGTACAAT TTTCCCAGTT 4440 

TACAGGTCTA TACTTAAGGG AAAAGTTGCA AGAATGCTGA AAAAAAATTG AACACAATCT 4500 

CATTGAGGAG CATTTTTTAA AAACTAAAAA AAAAAAAACT TTGCCAGCCA TTTACTTGAC 4560 

TATTGAGCTT ACTTACTTGG ACGCAACATT GC»AGCGCTG TGAATGGAAA CAGAATACAC 4620 

TTAACATAGA AATGAATGAT TGCTTTCGCT TCTACAGTGC AAGGATTTTT TTGTACAAAA 4680 

CTTTTTTAAA TATAAATCTT AAGAAAAATT TTTTTTAAAA AACACTTCAT TATGTTTAGG 4740 

GGGGAACTGC ATTTTAGGGT TCCATTGTCT TGGTGGTGTT ACAAGACTTG TTATCCATTT 4800 

AAAAATGGTA GTGGRAATTC TATGCCTTGG ATACACACCG CTCTTCAGGT TGTAAAAAAA 4860 

AAAAACATAC ATTOGGGAAA GGTTTAAGAT TATATAGTAC TTAAATATAG GAAAATX3CAC 4920 

ACTCATGTTG ATTCCTATGC TAAAATACAT TTATGGTCTT TTTTCTGTAT TTCT AGAATG 4980 

GTATTTGAAT TAAATGTTCA TCTAGTGTTA GGCACTATAG TATT TATA TT 6AAGCTTGTA 5040 

TTTTTAACTG TTGCTTGTTC TCTTAAAAGG TATCAATGTA CCTTTTTTGG TAGTGGAAAA 5100 

AAAAAAGACA GGCTGCCACA GTATATTTTT TTAATTTGGC AGGATAATAT AGTGCAAATT 5160 

ATTTGTATGC TTCAAAAAAA AAAAAAAGAG AGAAACAAAA AAGTGTGACA TTACAGATGA 5220 

GAAGCCATAT AATGGCGGTT TGGGGGAGCC TGCTAGAATG TCACATGGAT GGCTGTCATA 5280 

GGGGTTGTAC ATATCCTTTT TTGTTCCTTT TrCCTGCTGC CATACTGTAT GCAGTACTGC 5340 

AAGCTAATAA CGTTGGTTTG TTATGTAGTG TGCTTTTTGT CCCTTTCCTT CTATCACOCT 5400 

ACATTCCA6C ATCTTACCTT CATATGCAGT AAAAGAAAGA AAGAAAAAAA AAGGAAAAAA 5460 

AAAAAAAAAC CAATGTTTTG CAGTTTTTTT CATTGCCAAA AACTAAATGG TGCTTTATAT 5520 

TTAGATTGGA AAGAATTTCA TATGCAAAGC ATATTAAAGA GAAAGCCCGC TTTAGTCAAT 5580 

ACTTTTTTGT AAATG6CAAT GCAGAATATT TTGTTATTGG CCmTCTAT TCCTGTAATG 5640 

AAAGCTGTTT GTCGTAACTT GAAATTTTAT CTTTTACTAT GGGAGTCACT ATTTAT TATT 5700 

GCTTATGT6C CCTGTTCAAA ACAGAGGCAC TTAATTTGAT CTTT TATTTT TCTTTG TTTT 5760 

TATTTTTTTT TTTATTTAGA TGACCAAAGG TCATTACAAC CTGGCTTTTT ATTGT ATTT G 5820 

TTTCTGGTCT TTGTTAAGTT CTATTGGAAA AACCACTGTC TGTGTTmT TGOCAGTTGT 5880 

CTGCATTAAC CT6TTCATAC ACCCATTTTG TCCCTTTATT GAAAAAATAA AAAAAATTAA 5940 
A 

Seq ID NO: 85 Protein sequence:
Protein Accession #: NP_075044.1



1 11 21 31 41 51
| | | | | |

MSRRXQGKPQ HLSKREFSPE PLSAZLTDDE PDHGPLGAPB GDHDIiLTCGQ CQPWFPtGDI 60
LIFIEKKRKQ CNGSLCLEKA VDKPPSPSPZ EMKKASNPVB VGIQVTPEDD DCLSTSSRRI 120
CPKQEHIADK LLHWRGLSSP RSAHGALIPT PGMSABYAPQ GICXDEPSSY TCTTCKQPFT 180
SAWPLLQHAQ NTHGLRIYLB SEHGSPLTPR VGIPSGLGAE CPSQPPLHGI HIADNNPFNL 240
LRIPGSVSRE ASGLAEGRFP PTPPLFSPPP RHHLDPHRIE RLGAEEMALA THHPSAFORV 300
IiRLNPMAMEP PAMDFSRRLR ELAGNTSSPP LSPGRPSPMQ RLLQPFQPGS KPPFLATPPL 360
PPLQSAPPPS QPPVKSKSCE FCGKTFKFQS NLWHRRSHT GEKPYKCKLC DHACTQASKL 420
KRHMKTHMUK SSPMTVKSDD GLSTA5SPEP GTSDLVGSAS SAliXSWAKF KSENDPKLIP 480
           DEBEBB&BEE            VDYGFGLSIiE            GAWGVGDES 540
RALPDVMQGM VLSSMQHPSB APEQVL6BKH KRGHLAEAE6 HROTCDEDSV AGESDRIODG 600
TVNGRGCSPG ESASGGLSKK LLLGSPSSLS PFSKRIKLEK EFDLPPATMP NTENVYSQWL 660
AGYAASRQUC DPFLSFGDSR QSPFASSSEH SSENGSLRFS TPPGBIiDGGI SGRSGTGSGG 720
STPHISGPGT GRPSSKEGRR SDTCEYCGKV FKNCSNLTVH RRSHTGERPY RCEU3IYACA 780
OSSfOiTRHHK TH6QV6KDVY KCBICKMPFS VYSTLEKHMK KHHSD8VU1N OZKTE

Seq ID NO: 86 DNA sequence
Nucleic Acid Accession #: XM_035292.2
Coding sequence: 53-1576

1 11 21 31 41 51
| | | | | |

GCTCX3CTGGG CCGCGGCTCC OGGGTGTCXX AGGCCOGGCC GGTGC6CAGA 6CATGG0GGG 60
TGCGGGCCCX3 AAGCGG06CG CGCTAGOGGC GCCX3GCX3GCC GAGGAGAAGG AAGAGGOGGG 120
GGAGAAGATG CTGGCCGCCA AGAGCGOGGA CGGCTCGGOG COGGCAGGOG AGGGCGAGGG 180
CGTGACCCTG CAGCGGAACA TCACGCTGCT CAACGGCX3TG GCCATCATCX3 TGGGGACCAT 240
TATCGGCTGG GGCATCTT06 TGACGGGCAC GGGCGTGCTC AAGGAGGCAG GCTCGCCGGG 300
GCTGGOGCTG GIGGTGTGGG CC6C3GTGOGG OGTCTTCTCC ATCGTGGGOS CXSCrCTGCTA 360
CGCX3GAGCTC GGCACCACCA TCTCCAAATC OGGGGGCGAC TAOGCCTACA TGCTGGAGGT 420
CTACGGCTCG CTGCCCGCCT TCCTCAAGCT CTGGATOGAO CIGCTCATCA TOOGGGCTTC 480
ATCGCAGTAC ATOGTGGCCC TGGTCTTOGC CACCTACCTG CTCAAGOCGC TCTTCOOCAC 540

CTGCCOGGTG GOCGAGGAGG CRGOCAAGCT OGTGGCCTGC CTCTGayrGC TGCTGCTCAC 600

GGCCGTGAAC TGCTACftGOG TGAAGGCOGC CACCOGGGTC CAGGAtCCCT TTO00GCG6C 660 

CAAGCTCCTG GCOCTGGCCC TGATCATCCT GCTGGGCTTC GTCCAGATCG GGftflGGGTGA 720 

TGTGTCCftAT CT3VGATCCCA ACTTCTCATT TGAAGGCACC AAACTGGATG TGQGGAACAT 780 

TGTOCTGGCA TTATACAGCG GCCTCTTTGC CTATGGAGGA TGGAATTACT TGAATTTCGT 840 

CACAGAGGAA ATGATCAACC CCTACAGAAA CCTGCCCCTG GCCATCATCA TCTCCXTTGCC 900 

CATOGTGAOG CTGGTGTAOG TGCTGACXAA CCTGGCCTAC TTCACCACOC TCTCCACOGA 960 

GCAGATGCTC TCGTCCGAGG COGTGGCOGT GGACTTCGGG AACTATCACC TQGGCGTCaT 1020 

GTCCTGGATC ATCCCOGTCT TOSTCGGCCT GTCCTGCTTC GGCTCCGTCA ATGGGTCCCT 1080 

GTTCACaTCC TCCAGGCTCT TCTTCGTGGG GTCCOGGGAA GGCCACCTGC CCTCCATCCT 1140 

CTCCATGATC CACCCACAGC TCCTCACCCC CGTGCCGTCC CTOGTGTTCA OGTGTGTGAT 1200 

GAOSCrGCTC TACGCCTTCT OCAACGACAT CTTCTCOGTC ATCAACTTCT TCAGCTTCTT 1260 

CAACTGGCTC TGCX3TGGCCC TGGCCATCAT CX^GCATGATC TGGCTGOGCC ACflGAAAGCC 1320 

TCAGCTEGAG CGGaXSVTCA AGGTGAACCT GGCOCTGCCT GTGTTCTTCA TOCTGGCCTG 1380 

CCTCTTCCrG ATOSCCGTCT CCTTCrCGAA GACACCGGTG GAGTGTGGCA TCGGCTTCAC 1440 

CATCATCCTC AGCGGGCTGC COGTCTACTT CTTCGGGGTC TGGTGGAAAA ACAAGCCCAA 1500

GTGGCTCCTC CAGGGCATCT TCTCCACGAC CGTCCTGTGT CAGAAGCTCA TGCAGGTGGT 1560 
CCOXaVGGAG ACATAGCCAG GAGGCGGAGT GGCTGCOGC» GGACCATGC 

Seq ID NO: 87 Protein sequence:
Protein Accession #: XP_035292.2

1 11 21 31 41 51

| | | | | |

HAGAGPKRRA LAAPAAEEKE EARERMLAAK SADGSAPAGE GBGVTLQBNI TLUfGVAIIV 60
GTZIGSGIFV TPTGVLKEAG SFGIALWKA AOGVFSIVGA LCYAELGTri SKSGGDYAYM 120
LEVYGSIiPAF LKLWIELl.II RPSSQYIVAL VFATYLLKPL PPTCPVPBBA AKLVACLCVL 180
LLTAVNCYSV KAATHVQDAF AAAKLLALAL IILLGFVQIG KGDVSNLDPN FSFEGTKLDV 240
(2IIVLALYSG LFAYGGWNYL NFVTEEMINP YRNLPLAIH SLPIVTLVYV LTNLAYFTTL 300
STBQMLSSEA VAVDFGNYHL GVMSWIIPVF VGLSCFGSVN GSLPTSSRLP FVGSREGHLP 360
SILSMZHPQL LTPVPSLVFT CVMTLLYAFS KDIPSVIMFF SFFMWLCVAL AIIGMIWLRH 420
RKPELBSPIK VKLALPVFPI LACLFLIAVS FWKTPVBOGI GPTIILSGLP VYPPGVWHKN 480
KPiWLLQGIF STTVLCQKLM QWPQBT



Seq ID NO: 88 DNA sequence

Nucleic Acid Accession #: NM_005268.1

Coding sequence: 166-989

1 11 21 31 41 51

| | | | | |

TAAAAAGCAA AAGAATTCGC GGCCGCGTCG ACACGGGCTT CCCCGAAAAC CTTCCCCGCT 60 

TCTGGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC OGGCTGCTGG GAGCCAGGAG 120 

AGCCCTGAGO AGTAGTCACT CAGTAGCAGC TGACGOGTGG GTCCACCATG AACTGGAGTA 180 

TCTTTGAGGG ACTCCTGAGT GGGGTCAACA AGTACTCCAC AGCCTTTGGG CGCATCTGGC 240 

TGTCTCTGGT CTTCATCTTC CGCGTGCTG6 TGTACCTGGT GA0G6CCGAG OGTGTGTGGA 300 

GTCATGACCA CAAGGACTTC GACTGCAATA CTCGCCAGCC GGGCTGCTCC AACX5TCTGCT 360 

TTGATGACTT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT ATCCTGGTGA 420 

CATGCCCCTC ACTGCTCGTG GTCATGCACG TGGCCTACCG GGAGGTTCAG GAGAAGAGGC 480 

ACCGAGAAGC CX:ATCGGGAG AACAGTGGGC GCCTCTACCT GAACCCCGGC AAGAAGCGGG 540 

6TGGGCTCTG GTGGACATAT GTCTGCAGCX: TAGTGTTCAA GG06AGCGTG GACATCX3CCr 600 

TTCTCTATGT GTTCCACTCA TTCTACOOCA AATATATCCT CCCTCCTGTG GTCAAGTGCC 660 

ACGCAGATCC ATGTCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAGAAGAACA 720 

TTTTCACCCT CTTCATGGTG GCCACAGCTG CCATCTGCAT CCTGCTCAAC CTCGTGGAGC 780 

TCATCTACCT GGTGAGCAAG AGATGCX3UX3 AGTGCCTGGC AGCAAGGAAA GCTCAAGCCA 840 

TGTGCACAGG TCATCACCCC CACGGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT 900 

CGGGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCX3CCCCC 960 

GAGACCATGT GAAGAAAAGC ATCTTGTGAG GGGCTGCCTG GACTGGTCTG GCAGGTTGGG 1020 

CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGCAACC TGAGAGTCGG GGAGCTAAGC 1080 

CATGAGGTAG GGGCAGGCAA GAGAGAGGAT TCAGAOSCTC TGGGAGCCAS TTCCTAGTCC 1140 

TCAACTCCAG CCACCTGCCC CAGCTOGACG GCACTGGGCC AGTTCOCCCT CTGCTCT6CA 1200 
GCTCGGTTTC CTTTTCTAGA ATGGAAATAG TGAGGGCCAA TGC 

Seq ID NO: 89 Protein sequence:
Protein Accession #: NP_005259.1

1 11 21 31 41 51

ilNWSIFEGLL SGVNKYSTAF GRIWLSLVFI FRVLVYLVTA ERVWSDDHKD FDCNTRQPGC 60 

SNVCFDEPPP VSHVRLHAIiQ LILVTCPSLIi WMHVAYREV QBKRHREAHG ENSGRLYLNP 120 

GKKRGGLWWT YVCSLVFKAS VDIAFLYVPH SFYPKYILPP WKCHADPCP NIVDCFISKP 180 

SEKNIFTLFM VATAAICILL NLVELIYLVS XRCHECLAAR KAQAMCTGHH PHGTTSSCKO 240
DDLLSGDLIF LGSDSHPPLL PDRPRDHVXK TIL 



Seq ID NO: 90 DNA sequence 

Nucleic Acid Accession #: NM_002391.1

Coding sequence: 26-457 

1 11 21 31 41 51 

1 I I 1 I I 

CXIGGCGAAGC AGOGOGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCXTTCCT 60 
C6CCCTCCT6 GOGCTCACCT CX3GCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120 
CCQGGGGAGC GAGTGOGCTG AGTGGGCCTG QGGGCCCTGC ACOCCCAQCA GCAAGGATTG 180 
CGGOGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 
GCCCPGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 
TGaSTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC AGCCTGAAGA AGGOGCGCTA 360 



CAATGCTCAG TGCCAGGAGft GCATO0GO6T CAOCAAGCOC TGCACCOOCA AGACCAAAGC 420
AAAGGCCAAA GCCAAGAAAG GGAAOGGAAA GGACTAGACG CCAAGOCTGG ATGCCftAGGA 480
GCCCCTGGTG TCACATGGGG CCTOGCCAOG CCCTCCCTCT CCCAGGOCOG AGATGTGAOC 540
CACCAGTGCC TTCTGTCTGC TOGTTAGCTT TAATCAATCA TGOCCTGOCT Itfl 'L' L 'C l 'CTC 600
ACrOCCCAGC GCOkOCOCIA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCr 660
TGA6CCTOCC CCAAAGCAAT GTGASTOCCA GAGCCCGCTT TTGrrCTTCC CCACAATTCC 720
ATTACTAAGIl AACACATCAA ATAAACTGAC Trrrrccccc CAATAAAAGC TCTTcrrrrr 780
TAMAT

Seq ID NO: 91 Protein sequence:
Protein Accession #: NP_002382.1

1 11 21 31 41 51

KQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AKGPCTPSSK DCCSVGFREX3T 60
CGAQTQRIRC RVPCNWKKEP GADCKYKFEN WGAC0GGTGT KVRQGTUOCA RYNAQCQBTI 120
RVTKPCTPKT KAKAKAKKGK GKD



Seq ID NO: 92 DNA sequence 

Nucleic Acid Accession #: NM_005130.1

Coding sequence: 98-802



1 11 21 31 41 51
| | | | | |

CTCTACCTGA CACAGCTGCA GOCTGCAATT CACTCCGACT GOCTGGGATT GCACTGGATC 60
OGTGTGCTCA GAACAAGGTG AA05CCCAGC TGCAGCCftTG AAGATCTGTA GCCTCACCCT 120
GCTCTCCTTC CTCCTACTGG CTGCTCAGGT GCTCCTGGTG GAGGGGAAAA AAAAAGTGAA 180
GAATGGACTT CACAGCAAAG TGGTCTCAGA ACAAAAGGAC ACTCTGGGCA ACACCCAGAT 240
TAAGCAGAAA AGCAGGCCCG GGAACAAAGG CAAGTTTGTC ACCAAAGACX: AAGCCAACTG 300
CAGAT6GGCT GCTACTGAGC AGGAGGAGGG CATCTCTCTC AAGGTTGAGT GCACTCAATT 360
GGACCATGAA TTTTCCTGTG TCTTTCCTGG CAATCCAACC TCATGCCTAA AGCTCAAGGA 420
TGAGAGAGTC TATTGGAAAC AAGTTGCCCG GAATCTGCGC TCACAGAAA6 ACATCTGTAG 480
ATATTCCAAG ACAGCTGTGA AAACCAGAGT GTGCAGAAA6 GATTTTCCAG AATCCAGTCT 540
TAAGCTAGTC AGCTCCACTC TATTTGGGAA CACAAASCCC AG6AAGGAGA AAACAGAGAT 600
GTCCCCCAG6 GAGCACATCA AGGGCAAAGA GACCACCCCC TCTAGCCTAG CAGTGACCCA 660
GACCATGGCC ACCAAAGCTC CCGAGTGTGT GGAGGACCCA GATATGGCAA ACCAGAGGAA 720
GACTGCCCTG GAGTTCTGTG GAGAGACTTG GAGCTCTCTC TCCACATTCT TCCTCAGCAT 780
AGTGCAGGAC ACGTCATGCT AATGAGGTCA AAAGA6AA0G GGTTCCTTTA AGAGATGTCA 840
TGTCGTAAGT CCCTCTGTAT ACTTTAAAGC TCTCTACAGT CCCOCCAAAA TATGAACTTT 900
TGTGCTTAGT GAGTGCAACG AAATATTTAA ACAAGTTTTG TATTTTTTGC TTTTGTGTTT 960
TGGAATTTGC CTTATTTTTC TTGGATGCGA TGTTCAGAGG CTGTTTCCTG CAGCATGTAT 1020
TTCXATGGCX: CACACaVGCTA TGTGTTTGAG CAGGGAAGAG TCTTTGAGCT GAATGAGCCA 1080
GAGTGATAAT TTCAGTGCAA CGAACTTTCT GCTGAATTAA TGGTAATAAA ACTCTGGGTG 1140
TTTTTCAAAA AAAAAAAAAA



Seq ID NO: 93 Protein sequence:
Protein Accession #: NP_005121.1

1 11 21 31 41 51

MKICSLTU.S PLLLAAQVIjL VBGKKXVKNG LHSKWSEQK DTLGNTQXKQ KSRPGNKGKP 60
VTKDQANCRW AATEQEBGIS LKVBCTQLDH BFSCVFAiSNP TSCLKLKDER VYWKQVARNL 120
RSQKDIC»YS KTAVKTRVCR KDFPESSLKL VSSTIiPOJTK PRKBKTEMSP aKHIKGKETT 180
PSSLAVTQTM ATKAPECVED PDMANQRKTA LEPOGETWSS LCTPFLSIVQ DTSC



Seq ID NO: 94 DNA sequence 

Nucleic Acid Accession #: NM_012101

Coding sequence: 125-1891 



1 11 21 31 41 51
| | | | | |

CTCCTCACAG GTGTGTCTCT AGTCCTCGTG GTTGCCTGCC CCACTCCCTG CCGAGAOGCC 60
TGCCAGAAAG GTCACCTATC CTGAACCCCA GCAAGCCTGA AACAGCTCAG CCAAGCACCC 120
TGCGATGGAA GCTGCAGATG CCTGCAG6A6 CAAOGGGTGG AGOCCAGAAG CCAGGGATGC 180
CCGGAGCCCG TCGGGCCCCA GTGGCAGCCT GGAGAATGGC ACCAAG6CTG ACGGCAAGGA 240
TGCCAAGACC ACCAACGGGC ACGGCGGGGA GGCAGCTGA6 GGCAAGAGCX: TGGGCAGOGC 300
CCTGAAGCCA GGGGAAGGTA GGAGCGCCCT GTTOGOGGGC AATGAGTGGC GGCX3ACCCAT 360
CATCCAGTTT GTCX3AGTCCG GGGACGACAA GAACTCXAAC TACTTCAGCA TGGACTCTAT 420
GGAA66CAAG AGGT06C0GT ACGCAGGGCT CCAGCTGGGG GCPGCCAAGA AGCCACCCGT 480
TACCTTT6GC GAAAAGGG06 ACGTGCGCAA GTCCATTTTC TOGGAGTCCC GGAAGCCCAC 540
GGTGTCCATC ATGGAGCCCG GGGAGACCOG GGGGAACAGC TACCCCCGGG CG6ACACGGG 600
CCTTTTTTCA CGGTCCAAGT CCGGCTCCGA GGAGGTGCrQ T60SACTGCT 6CAT0G6CAA 660
CAAGCAGAAG GCGGTCAAGT CCTGCCTGGT GTGCCAGGCC TCCTTCTGCG AGCTGCATCT 720
CAAGCCCCAC CTGCAGGGCG CCGCCTTCCG AGACCACCAG CTGCTCGAGC CCATCCGGGA 780
CTTTGAGGCC OGCAAGTGTC CCGTGCATGG CAAGACGATG GAGCTCTTCT GCCAGACCGA 840
CCA6ACCTGC ATCTGCTACC TTPGCATGTT CCAG6AGCAC AAGAATCATA GCACXXTTGAC 900
AGTGGAGGA6 GCCAAGGCGG AGAAGGAGAC G6AGCTGTGA CTGCAAAAGO AGCAGCTGCA 960
GCTCAAGATC ATTGA6ATTG AGGATGAAGC T6AGAAGTGG CAGAAG6AGA AGGACOGCAT 1020
CAAGAGCTTC ACCACCAATG AGAAGGCCAT CCTGGAGCAG AACTTCCGGG ACCTGGTGOG 1080
GGACCTGGAG AAGCAAAAGG AGGAAGTCAG GGCTGCGCTG GAGCAGCXsGG AGCAQGATGC 1140
TGTGGACCAA GTGAAGGTGA TCATGGATGC TCTGGATGAG AGA6CCAAGG TGCTGCATGA 1200
GGACAAGCAG ACCOSGGAGC AGCTGCATAG CATCAGCGAC TCTOTGTTGT TTCTGCAGGA 1260
ATTTGGTGCA TTGATGAGCA ATTACTCTCT CCCCCCACCC CTGCCCACCT ATCATGTCCT 1320
GCTGGAGGGG 6AGGGCCTGG GACAGTCACT AGGCAACTTC AAGGAOGACC TGCTCAATGT 1380
ATGC3VTGCGC CACGTTGAGA AGATGTGCAA GGCGGAOCTG AGCX^GTAACT TCATTGAGAG 1440
GAAGCACATG GAGAACGGTG GTGACCATOG CTATGTGAAC AACTACAG6A ACAGCTT06G 1500
GGGTGAGTGG AGTGCACCGG ACACX^TGAA GAGATACTCC ATGTACCTGA CAOCCAAAGG 1560
TGGGGTCCGG ACATCATACX: AGCCCTCGTC TOCTGGCOGC TTCAOCAAGG AGACCACCCA 1620
GAAGAATTTC AACAATCTCT ATGGCACCAA A£3GTAACTAC ACCTGCCGG6 TCTGGGAGtA 1680
CTCCTCCAGC ATTCAGAACT CTGACAATGA CCTGCCCGTC GTCCAAGGCA GCTOCTOCTT 1740
CTCCCTGAAA GGCTATCCCT CCCTCATGOG GAGCCAAAGC CCCAAGGCCC AGCCCCAGAC 1800
TTGGAAATCT GGCAAGCAGA CTATGCTGTC TCACTACOGG CCATTCTACG TCAACAAACG 1860
CAACGGGATT GGGTCCAAOG AAGCCCCATG AGCTCCTGGC GGAAOGAAOG AGGCGCCACA 1920
CCCCTGCTCT TCCTCCTGAC OCTGCTCCTC TTGCCTTCTA AGCTACTGTG CTTGTCTGGG 1980
TGGGAGCGAG CCTGGTCCTG CACCTGCCCT CTGCAGCOCT CTGCCAGCCT CTTGGGGGCA 2040
GTTCCGGCCT CTC30C3ACTTC CCCACTOGCC ACACTCCATT CAGACTCCTT TCCTGCCTTG 2100
TGACCTCAGA TGGTCACCAT CATTCCTGTG CTCAGAGGCX: AACCCATCAC AGGGGTGAGA 2160
TAGGTTGGGG CCTGCCCTAA CCCGCCACCC TCCTCCTCTC GGGCTGGATC TGGGGGCTAG 2220
CAGTGAGTAC COGCAIGGTA TCAGCCTGCX: TCTCCCGCCC AOSCCCTGCT GTCTCCAGGC 2280
CTATAGACGT TTCrCTCCAA GGCCCTATCC OCCAATGTTG TCAGCAGATG OCTGGACAGC 2340
ACAGCCACCC ATCTCCCATT CACATGGCCC ACCTOCTCCT TCCCAGAGGA CTGGCGCTAC 2400
GTGCTCrCTC TCGTCCTACC TATCAATGCC CAGCATGGCA GAACCIGCAG TG6CCAAI3GG 2460
CTGCAGATGG AAACCTCTCA GTGTCTTGAC ATCAOOCTAC CCAGGCGGTG GGTCTOCACC 2520
ACRGCCACTT TGAGTCTGTG GTCCCTGGAG GGTGGCTTCT CCTGACTGGC AGGATGACCT 2580
TAGCCAAGAT ATTCCTCTGT TCCCTCTGCT GAGATAAAGA ATTCOCTTAA CATGATATAA 2640
TCCACCCATG CAAATAGCTA CTGGCCCAGC TACCATTTAC CATTTGOCTA CAGAATTTCA 2700
TTCAGTCTAC ACTTTGGCAT TCTCTCTGGC GATGGAGTGT GGCTOGGCTG ACCX3CAAAAG 2760
GTGCCTTACA CACTGCCCCX: ACCCTCAGCC GTTGOCCCAT CAGAGGCTGC CTCCTCCTTC 2820
TGATTACCCC CCATGTTGCA TATCAGGGTC CTCAAGGATT GGAGAGGAGA CAAAACCAGG 2880
AGCAGCACAG TGGGGACATC TCCCGTXrrCA ACAGCCCCAG G0CTATG6GG GCTCTGGAAG 2940
GATGGGCCAG CTTGCAGGGG TTGGGGAGGG AGACATCCAG CTTGG6CTTT CCCCTTTGGA 3000
ATAAACC31TT



Seq ID NO: 95 Protein sequence:
Protein Accession #: NP_036233.1



MEAADASRSN 
KPGEGRSALP 
FAEKGDVRKS 
QKAVKSCLVC 
TCICYLCMPQ 
SFTTNEKAIL 
KQTREQUISI 
MRHVEKMCKA 
VRTSYQPSSP 
LKGYPSLMRS 



11 

I 

GSSPEARDAR 
AGNEWRRPII 
IFSESRKPTV 
QASFCELHLK 
EHKNHSTVTV 
EQE7FBDLVRD 
SOSVLFIiQEF 
DLSRNPIERN 
GRFTKETTQK 
QSPKAQPQTW 



21 
I 

SPSGPSGSIiE 
QFVESGDDKN 
SIMEPGETRR 
PHLEGAAFRD 
EEAKAEKETE 
LSKQKEEVRA 
GALMSNYSIiP 
HMENGGDHRY 
NFNNLYGTKG 
KSGKQTMLSH 



31 
I 

NGTKAD6XDA 



41 
1 



KSYPKADTGL 
HQLLEPIRDF 
I^SIiQKEQLQL 
ALSQRBQDAV 
PPLPTYHVLL 
VNNrmSFGG 
NYTSRVWEYS 
YRPFYVNKGM 



GKRSPYAGLQ 
FSRSKSGSEE 
EARKCPVHGK 
KIISIEDEAE 
DQVKVIMDAIi 
EOEGLGQSIiG 
EWSAPDTMXR 



51 
I 

AEGKSLGSAL 
LGAAKKPPVT 
VLCDSCIGNK 
TMELFOQTOQ 
KWQKEKDRIK 
DERAKVLHE© 
NFKDDLLNVC 
YSMYLTPKGG 
FVVQGSSSFS 



GIGSNEAP 



60 
120 
180 
240 
300 
360 
420 
480 
540 



Seq ID NO: 96 DNA sequence

Nucleic Acid Accession #: NM_080668.1

Coding sequence: 83-841 

1 11 21 31 41 51

GGCACGAGOG CAGCXSAGTOG CCTTCCCGGT TGGCGGGOGC COGGGGCGGC GGOGCTGGAG 60
GAGCTCGAGA CQGAGCCTAG TTATGTCTGG GAGGCGAAOG CGCTCCX3GAG GAGCCXSCTCA 120
GCGCTCOGGG CCAAGGGCCC CATCTCCTAC TAAGCCTCTG CGGAGGTCXX AGOSGAAATC 180
AGGCTCIGAA CTCCCGAGCA TCCTCCCTGA AATCTGGCCG AAGACACCCA ^^^^ 240
AGTCAGAAAG CCCATCGTCT TAAAGAGGAT OGTGGCCCAT GCTGTAGAGG TCCCAGCTGT 300
CCAATCACCT CGCAGGAGCC CTAGGATTTC CTTTTTCTTG GAGAAAGAAA AOGAGCCCCC 360
TX3GCAGGGAG CTTACTAAGG AGGACCTTTT CAAGACACAC AGCGTCCCTG CCACCCCCAC 420
CAGCACTCCT GTGCCGAACC CTGAGGCCGA GTCCAGCTCC AAGGAAGGAG AGCTGGACGC 480
CAGAGACTTC GAAATGTCTA AGAAAGTCAG GCGTTCCTAC AGCCGGCTGG AGACCCTGGG 540
CTCTGCCTCT ACCTCCACCC CAGGCCGCC6 GTCCTGCTTT GGCTTCGftGG GGCTGCTGGG 600
GGCAGAAGAC TTGTCCGGAG TCTCGCCAGT GGTGTGCTCC AAACTCACOG AGGTCCOCAG 660
GGTTTGTGCA AAGCCCTGGG CCCCAGACAT GACTCTCCCT GGAATCTCCC CACCACCCGA 720
GAAACAGAAA OGTAAGAAGA AGAAAATGCC AGAGATCTTG AAAACGGAGC TGGATGAGTG 780
GGCTGCGGCC ATGAATCOCG AGTTT6AAGC TGCTGAGCAG TTTGATCTCC TGGTTGAATG 840
AGATGCAGTG GGGGGTOCAC CTGGOCAGAC TCTCCCTCCT GTCCTGTACA TAGCCACCTC 900
CCTGTGGA6A GGACACTTAG GGTCCCCTCC CCTGGTCTTG TTACCTGTGT GTGTGCTGGT 960
GCTGCGCATG AGGACTGTCT GCCTTTGAGG GCTIGGGCAG CAGOGGCAGC CATCTTOGTT 1020
TTAGGAAATG GGGCCGCCTG GCCCAGCCAC TCACTGGTGT CCTGTCTCTT GTOCTCCTOT 1080
CCTTCCTATC TCCCCAAAGT ACCATAGCCA GTTTCCAGAT GGGCCACAGA CTGGGGAGGA 1140
GAATCAGTGG CCCAGCCAGA AGTTAAAGGG CTGAGGGTTG AGGTGA6AGG CACCTCTGCT 1200
Cri-GTTGGGA GGGGTGGCTG CTTCGAAATA GGCCCaGGGG CTCTGCCAGC CTCCGCCTCT 1260
CCCTCCTCAG TTGCCTTCTG TTGGTGGCTT TCTTCTTGAA CCCACCTGTG TAAAGAGGTT 1320
TTCAGTTCCG TGGGTTTCOC CTTTCATTCT GTAAATAGTC CCAGAGAGAA TTOGT^GCT 1380
GAGGGCAATT CTGTCTTGGA GGAAGAAGCT GGACATTCAG CCTGTQGAGT CTGAGTTTTG 1440
AAGGATGTAG GGAGCCTTAG TTGGGTCTCA GACCATAAGT GTGTACTACA CAGAAGCTCT 1500
GTTTTCTAGT TCTGGTCTGC TGTTGAGATG TTTGGTAAAT GCCAGGTTGA TAGGOOGCTG 1560
GCTGCTTGGA GCAAAGGGTG CATTTCAGGG TGTGGCCACC AGGTGCTGTG AGTTTCTGTG 1620
GCTCATGGCC TCTQGOCTOQ TCCCTTCCAC AQGGCCCACG CTGGACrrCTT ACCACTCTGC 1680
TGCAGGGGTG GAAGGTGGCC CCTCTTGTCA CCCATACCCA TTTCTTACAA AATAAGTTAC 1740
ACCGAGTCTA CTTGGCCCTA GAAGAGAAAG TTGAAGAGTC CCAGACCTAC TAGCATTTTG 1800
CAACTATGCT TCTAAAGTCC TCGGAAAGTT TCCTCGCGTA CCAGACAGOG GCGGGGGCTG 1860
ATAGCAATTT TAGTTrTTGG CCTCCCTATC CTCTCACATG AGAACACTGC CTGGATGCAT 1920
CTCATGATCT CIOGAGAATT TCCOCATCTT TCTCTTCTTT CCATCGTGTG GATTCAATAG 1980
TTTG6ATTTG AAGGCTGCCC TGCCCCGGAC TCTCCTGCCG CACCCCTGGC CATTGTACCT 2040
TTTGATGTTT AGAAGTTCGT GGAAGTAGAC GCTGAGGTGT GCAGAGGAGC TGGTGG^ 2100
CAGAGAATGC CAGGGAAGAT GAGTGCTCGG TCAGGGTACT TGGATGAAAC ^^^^^ 2160
AGGCGGGCOC TAATAAAACC CTCTCCCAGG TCTGGGJW3TC CCAGGCCATC TGCTCAACGC 2220

TCTOTGGrrr GTCAGACCTG CAAGCAACOC CCXTGCT GG G GAW3CCTWQG TGTCCTTGflfi 2280

CTGAACOGCA CTCAAGAACT CTTGTCCTCA CTGGCTGATG CAGCAGAACT CTTGGGAAAT 2340 

GTCTTACTCC TGC»GAATCA GGAGTCACCA GATGATGCAG AGTTGACATC ATCATTGCSUV 2400 

AGTTCICTCT TCCTCAGGAA CTAAATTTAA GGAAAAAATG OGATTTroTT TTACAGTTGG 2460 
AAAAAAACCC TGATT AAAGA GTITCTG C CT GTTAAAAAAA AAAAAAAAAA AAAAAA 

Seq ID NO: 97 Protein sequence: 
Protein Accession #: NP_542399.1

1 11 21 31 41 51 

MSGSRTRSGG AAQRSGPRAP SPTKPLRRSQ RKSGSELPSI LPEIWPKTPS AAAVRKPIVL 60

KRIVAHAVEV PAVQSPRRSP RISFFLEKEN EPPGRELTKE DLFKTHSVPA TPTSTPVPNP 120

EAESSSKEGE LDARDLEMSK KVRRSYSRLE TLGSASTSTP GRRSCFGFEG LLGAEDLSGV 180

SPVVCSKLTE VPRVCAKPWA PEMTLPGISP PPEKQKRKKK KMPEILKTEL DEWAAAMNAE 240
FEAAEQFDLL VE

Seq ID NO: 98 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 58-12444

1 11 21 31 41 51

OGGGCATTTC CGGGTCCGGG CCGAGCGGGC GCACGCGOGG GAGGGGGACT OGGOGGCATG 60 

GCGGGCTCCG GAGCCGGTGT GOOTTGCTCC CTGCTGCGGC TGCAGGACAC CTTGTCCGCT 120 

GCGGACCGCT GCGGTGCTGC CCTGGCXXKST CRTCAACTGA TCCGCGGCCT GGGGCft^ AA 180 

TGCGTCCTGA GCAGCAGCCC CGCGGTGCTG GCATTACAGA CATCTTTAGT TTTTTCCAGA 240 

GATTTCGCTT TCCTTGTATT TGTCCGGAAG TCACTCAACA GTATTGAATT TCGTGAATGT 300 

AGAGAAGAAA TCCTAAAGTT TTTATGTATT TTCTTAGAAA AAATGGGOCA GAAGATOSCA 360 

tCTTACTCTG TTCAAATTAA GAACACTTGT ACCAGTGTTT ATACAAAAGA TAGAGCTGCT 420 

AAATGTAAAA TTCCAGCCCT GGACCnCTT ATTAAGTTAC TTCAGACTTT TAGAAGTTCT 480 

ACACTCATGG ATGAATTTAA AATTGGAGAA TTATTTAGTA AATTCTATCG AGAACTTGCA 540 

TTGAAAAAAA AAATACCAGA TACAGTTTTA GAAAAAGTAT A TGAGCT CCT AG GATTATT G 600 

GGTGAAGTTC ATCCTAGTGA GATGATAAAT AATGCAGAAA ACCTGTTCaS CGv,-i.-AVrLT6 660 

GGTGAACTTA AGACCCAGAT GACATCAGCA GTAAGAGAGC CCAAACTAOC TGTTCTGGCA 720 

GGATGTCTGA AGGGGTTGTC CTCACTTCTG TGCAACTTCA CTAAGTCCAT GGAAGAAGAT 780 

CCCCAGACTT CAAGGGAGAT rPTTAATTTT GTACTAAAGG CAATTCGTCC TCAGATTGAT 840 

CTGAAGAGAT ATCCTGTGCC CTCAGCTGGC TTGCGCCTAT TTGCCCTGCA TGCATCTCAG 900 

TTTAGCACCT GCCTTCTGGA CAACTACGTG TCTCTATTTG AAGTCTTGTT AAAGTGGTGT 960 

GCCCACACAA ATGTAGAATT GAAAAAAGCT GCACTTTCAG CCCTGGAATC CTTTCTGAAA 1020 

CAGGTTTCTA ATATGGTGGC GAAAAATGCA GAAATGCATA AAAATAAACT GCAGTACTTT 1080 

ATGGAGCA6T TTTATGGAAT CATCAGAAAT GTGGATTCGA ACAACAAGGA GTTATCTATT 1140 

GCTATCCGTG GATATGGACT TTTTGCAGGA COGTGCAAGG TTATAAACGC AAAAGATGTT 1200 

GACTTCATGT AOGTTGAGCT CATTCAGCGC TGCAAGCAGA TCTTCCTCAC CXSVGACAGAC 1260 

ACTGGTGACG ACOGTCTTTA TCAGATGCCA AGCTTCCTCC AGTCrGTTGC AAGCGTCTTG 1320 

CTGTACCTTG ACACAGTTCC TGAGGTGTAT ACTCCAGTTC TGGAGCACCT OGTGGTGATG 1380 

CAGATAGACA GTTTCCCACA GTACAGTCCA AAAATGCAGC TGGTGTGTTG CAfiAGCCATA 1440 

GTGAAGGTGT TCCTAGCTTT GGCAGCAAAA GGGCCAGTTC TCAGGAATTG aVTTAGTACT 1500

GTGGTGCATC AGGGTTTAAT CAGAATATGT TCTAAACCAG TGGTCCTTCC AAAGGGCCCT 1560 

GAGTCTGAAT CTGAAGACCA CCGTGCTTCA GGGGAAGTCA GAACTGGCAA ATGGAAGGTG 1620 

CCCACATACA AAGACTACGT GGATCTCTTC AGACATCTCC TGAGCTCTGA CCAGATGATG 1680 

GATTCTATTT TAGCACATGA AGCATTTTTC TCTGTGAATT CCTCCAGTGA AAGTCTGAAT 1740 

CATTTACTTT ATGATGAATT TGTAAAATCC GTTTTGAAGA TTQTTGAGAA ATTGGATCTT 1800 

ACACTTGAAA TACAGACTGT TGGGGAACAA GAGAATGGAG ATGAGGCGCC TGGTGTTTGG 1860 

ATGATCCCAA CTTCAGATCC AGOGGCTAAC TTGCATCCAG CTAAACCTAA AGATTTTTCG 1920 

GCTTTCATTA ACCTGGTGGA ATTTTGCAGA 6AGATTCTCC CTGAGAAACA AGCAGAATTT 1980 

TTTGAACXaVT GGGTCTACTC ATTTTCATAT GAATTAATTT TGCAATCTAC AAGGTTGCCC 2040 

CTCATCAGIXS GTTTCTACAA ATTGCTTTCT ATTACAGTAA GAAATGCCAA GAAAATAAAA 2100 

TATTTCGAGG GAGTTAGTCC AAAGAGTCTG AAACACTCTC CTGAAGACCC AGAAAAGTAT 2160 

TCTTCCTTTG CrrTATTTGT GAAATTTGGC AAAGAGGTGG CAGTTAAAAT GAAGCAGTAC 2220 

AAAGATGAAC TTTTGGCCTC TTGTTTGACC TTTCTTCTGT CCTTGCCACA CAACATCATT 2280 

GAACTOSATG TTAGAGCCTA CGTTCCTGCA CTGCAGATGG CTTTCAAACT GGGCCTGAGC 2340 

TATACCCCCT TGGCAGAAGT AGGCCTGAAT GCTCTAGAAG AATGGTCAAT TTATATTGAC 2400 

AGACATGTAA TGCAGCCTTA TTACAAAGAC ATTCTCCCCT GCCTGGATGG ATACCTGAAG 2460 

ACTTCAGCCT TGTCAGATGA GACCAAGAAT AACTGGGAAG T6TCAGCTCT TTCT QCGG CT 2520 

GCCCAGAAAG GATTTAATAA AGTGGTGTTA AAGCATCTGA AGAAGACAAA GAACCTTTCA 2580 

TCAAACGAAG CAATATCCTT AGAAGAAATA AGAATTAGAG TAGTACAAAT GCTTGGATCT 2640 

CTAGGAGGAC AAATAAACAA AAATCTTCTG ACAGTCACGT CCTCAGATGA GATGATGAAG 2700 

AGCTATGTGG CCTSCGGhCPiG AGAGAAGCGG CTGAGCTTTG CAGTGCCCTT TAGAGAGATQ 2760 

AAACCTGTCA TTTTCCTGGA XGTGTTOCTG CCTCGAGTCA CAGAAtTAGC GCTCACAGCC 2820 

AGTGACAGAC AAACTAAAGT TGCAGCCTGT GAACTTTTAC ATAGCATGGT TATGTTTATG 2880 

TTGGGCAAAG CCAOGCAGAT GCCAGAAGGG GGACAGGGAG CCCCACCCAT GTACCAGCTC 2940 

TATAAGCGGA CGTTTCCTGT GCTGCTTCGA CTTGCGTGTG ATGTTGATCA GGTGACAAGG 3000 

CAACTGTATG AGCCACTAGT TATGCAGCTG ATTCACTGGT TCACTAACAA CAAGAAATTT 3060 

GAAAGTGAGG ATACTGTTGC CTTACTAGAA GCTATATTGG ATCGAATTGT GGACCCTGTT 3120 

GACAGTACTT TAAGAGATTT TTGTGGTCGG TGTATTCGAG AATTCCTTAA A TGGTC CATT 3180 

AAGCAAATAA CACCACAGCA GCAGGAGAAG AGTCCAGTAA ACACCAAATC 6CTTTTCAAG 3240 

OGACTTTATA GCCTTGOGCT TCACCXXaAT GCTTTCAAGA GGCTGGGAGC ATCACT TOCC 3300 

TTTAATAATA TCTACAGGGA ATTCAGGGAA GAAGAGTCTC TGGTGGAACA GTTTGTGTTT 3360 

GAAGCCTTGG TGATATACAT GGAGAGTCTG GCCTTAGCAC ATGCAGATGA GAAGTCGITA 3420 

GGTACAATTC AACAGTCTTG TGATGCCATT GATCACCTAT GCCGCATCAT TGAAAAGAAG 3480 

CATGTTTCTT TAAATAAAGC AAAGAAAGGA OGTTTGCCGC GAGGATTTCX: ACCTTCCGCA 3540 

TCATTGTGTT TATTGGATCT GGTCAAGTGG CTTTTAGCTC ATTGTGGGAG GCCCCAGACA 3600

GAATGTCGAC ACAAATCCAT TGAACTCTTT TATAAATTCG TT CCTT TATT GCCA GGCAAC 3660 

AGATCCCCTA ATTTGTGGCT GAAAGATGTT CTCAAGGAAG AAGGTGTCTC TTTTCTCATC 3720 

AACACCTTTG AGGGGGGTGG CTGTGGCCAG CCXTTCGGGCA TCCTGGCOCA GCCCACCCTC 3780 

TTGTACCTTC GGGGGCCATT CAGCCIGCAG GCCACGCTAT GCTOGCTGGA OCTGCTCCTG 3840 
GCaSCGTTGG AGTGCTACAA GACGTTCATT GGCGAGAGAA CrCTAQCSAGC GCTCCaGGTC 3900

CTAGGTACTG AAGCCCAGTC TTCACrTTTG AAAGCAGTGG CTTTCTTCTT AGAAAGCATT 3960 

GCCATGCATC ACATTATAGC AGCAGAAAAG TGCTTTGGCA CTGGGGCAGC AGGTAACACA 4020 

ACAAGCCCAC AAGAGGGAGA AAGGTACAAC TACAGCAAAT GCACCGTTGT GGTCCGGATT 4080 

ATGGACTTTA CCAOGACTCT GCTAAACACC TCCCCGGAAG GATGGAACCT CCIGAAGAAG 4140 

GACTTGTGTA ATACACACCT GATGAGAGTC CIGGTGCAiSA OGCTOTGTGA GCCCGCAAGC 4200 

ATAGGTTTCA ACATCGGAGA CGTCCAGGTT ATGGCTCATC TTCCTGATGT TTGTGTGAAT 4260 

CTGATCAAAG CTCTAAAGAT GTCCCCATAC AAAGATATCC TAGftGACCCA TCTGAGAGAG 4320 

AAAATAACAG CACAGAGCAT TGAGGAGCTT TGTGCOGTCA ACrPGTATGG CCCTGAOGOS 4380 

CAAGTGGACA GGACCAGGCT GGCTGCTGTT GTGTCTGCCT GTAAACAGCT TCACAGA6CT 4440 

GGGCTTCTCC ATAATATATT ACCGTCTCAG TCCACAGATT TGCATCATTC TGTTGGCACA 4500 

GAACTTCTTT CCCTGGTTrA TAAAGGCATT GCXXXTGGAG ATGAGAGACA GTGTCTGCCT 4560 

TCTCTAGACC TCAGTTCTAA GCAGCTGGCC AGOGGACTTC TGGAGTTAGC CTTTGCTTTT 4620 

GGAGGACTGT GTGAGCGCCT TGTGAGTCTT CTCCTGAACC CAGOGGTGCT GTCCACCGC6 4680 

TCCTT6GGCA GCTCACAGGG CAGCGTCATC CACTTCTCCC ATGGGGAGTA TTTCTATAGC 4740 

TTGTTCTCAG AAACGATCAA CACGGAATTA TTGAAAAATC TGGATCTTGC TGTATTGGAG 4800 

CTCATGCAGT CTTCAGTGGA TAATACCAAA ATGGTGAGTG CCGTTTTGAA CGGCATGTTA 4860 

GACCAGAGCT TCAGGGAGCG AGCAAACCAG AAACACCAAG GACTGAAACT TGOGACTACA 4920 

ATTCIGCAAC ACTGGAAGAA GTGTGATTCA TGGTGGGCCA AAGATTCCCC TCTCG AAACT 4980 

AAAATGGCAG TGCTGGCCrT ACTGGCAAAA ATTTTACAGA TTGATTCATC TGTATCTTTT 5040 

AATACAAGTC ATGGTTCATT CCCTGAAGTC TTTACAACAT ATATTAGTCT ACTTGCTGAC 5100 

ACAAAGCTGG ATCTACATTT AAAGGGCCAA GCPGTCACTC TTCTTCCATT CTTCACCAGC 5160 

CTCACTGGAG GCAGTCTOQA GGAACTTAGA OGTGTTCTGG AGCAGCTCAT OSTTGCTCAC 5220

TTCCCCATGC AGTCCAGGGA ATTTCCTCCA GGAACTCCGC GGTTCAATAA TTATGTGGAC 5280

TGCATGAAAA AGTTTCTAGA TGCATTGGAA TTATCTCAAA GCCCTATGTT GTTGGAATTG 5340

ATGACAGAAG TTCTTTGTCG GGAACAGCAG CATGTCATQG AAGAATTATT TCAATCCAGT 5400

TTCAGGAGGA TTGCCAGAAG GGGTTCATGT GTCACACAAG TAGGCCTTCT GGAAAGOCTG 5460 

TATGAAATGT TCAGGAAGGA TGACCCCCGC CTAAGTTTCA CACG CCAGTC CTTTGTGGAC 5520 

CGCTCCCTCC TCACTCTGCT GTGGCACTGT AGCCTGGATG CTTTGAGAGA ATTCTTCAGC 5580 

ACAATTGTGG TGGATGCCAT TGATXrrGTTG AAGTCCAGGT TTACAAAGCT AAATGAATCT 5640 

ACCTTTOATA CTCAAATCAC CAAGAAGATG GGCTACTATA AGATTCTAGA CGTGATGTAT 5700 

TCT C GOCTTC CCAAAGATtSA TGTTCATGCT AAGGAATCAA AAATTAATCA AGTTTTCCAT 5760 

GGCTCGTGTA TTACAGAAGG AAATGAACTT ACAAAGACAT TGATTAAATT GTGCTACGAT 5820 

GCATTTACAG AGAACATGGC AGGAGAGAAT CAGCTGCTGG AGAGGAGAAG ACTTTACCAT 5880 

TGTGCAGCAT ACAACTGCGC CATATCTGTC ATCTGCTGTG TCTTCAATGA GTTAAAATrT 5940 

TACCAAGGTT TTCTGTTTAG TGAAAAACCA GAAAAGAACT TGCTTATTTT TGAAAATCTG 6000 

ATOGAOCTGA AGGGCCGCTA TAATTTTCCT GTAGAAGTTG AGGTTCCTAT GGAAAGAAAG 6060 

AAAAAGTACA TTGAAATTAG GAAAGAAGOC AGAGAAGCA6 CAAATGGGGA TTCAGATGGT 6120 

CCTTCCTATA TGTCTTCCCT GTCATATTTG GCAGACAGTA CCCTGAGTGA QGAAATGAGT 6180 

CAATTTGATT TCTCAACCGG AGTTCAGAGC TATTCATACA GCTCCCAAGA CCCTAGACCT 6240 

GCCACTGGTC GTTTTCGGAG ACGGGAGCAG CGGGACCCCA CGGTGCATGA TGATGTGCTG 6300 

GAGCTGGAGA TGGACGAGCT CAATCGGCAT GAGTGCATGG CGCCCCTGAC GGCCCTGGTC 6360 

AAGCACATCC ACAGAAGCCT GGGCCCGCCT CAAGGAGAAG AGGATTCAGT GCCAAGAGAT 6420 

CTTCCTTCTT GGATGAAATT CCTCCATGGC AAACTGGGAA ATCCAATAGT ACCATTAAAT 6480 

ATCCGTCTCr TCTTAGCCAA GCTT G TT A TT AATACAGAAG AGGTCTTTCG CCCTTACGCG 6540 

AAGCACTGGC TTAGCCCCTT GCTGCAGCTG GCTGCTTCTG AAAACAATGG AGGAGAAGGA 6600 

ATTCACTACA TGGTGGTTGA GATAGTGGCC ACTATTCTTT CATGGACAGG CTTGGCCACT 6660 

CCAACAGGGG TCCCTAAAGA TGAAGTGTTA GCAAATCGAT TGCTTAATTT CCTAATGAAA 6720 

CATGTCTTTC ATCCAAAAAG AGCTGTGTTT AGACACAACC TTGAAATTAT AAAGAQXTT 6780 

GTOSAGTGCT GGAAGGATTG TTTATCCATC CCTTATAGGT TAATATTTGA AAAGTTTTCC 6840 

GGTAAAGATC CTAATTCTAA AGACAACTCA GTAGGGATTC AATTGCTAGG CATCGTCATG 6900 

GCCAATGACC TGCCTCCCTA TGACCCACAG TGTGGCATCC AGAGTAGCGA ATACTTCCAG 6960 

GCTTTGGTGA ATAATATGTC CTTTGTAAGA TATAAAGAAG TGTATGCCGC TGCAGCAGAA 7020 

GTTCTAGGAC TTATACTTCG ATATGTTATG GAGAGAAAAA ACATACTGGA GGAGTCTCTG 7080 

TGTGAACTGG TTGCGAAACA ATTGAAGCAA CATCAGAATA CTATGGAGGA CAAGTTTATT 7140 

GTGTGCTTGA ACAAAGTGAC CAAGAGCTTC CCTCCTCTTG CAGACAGGTT CATGAATGCT 7200 

GTGTTCTTTC TGCTGCCAAA ATTTCATGGA GTGTTGAAAA CACTCTGTCT GGAGGTGGTA 7260 

CTTTGTCGTG TGGAGGGAAT GACAGAGCTG TACTTCCAGT TAAAGAGCAA GGACTTCGTT 7320 

CAAGTCATGA GACATAGAGA TGATGAAAGA CAAAAAGTAT GTTTGGACAT AATTTATAAG 7380 

ATGATGCCAA AGTTAAAACC AGTAGAACTC CGAGAACTTC TGAACCCCGT TGTGGAATTC 7440 

CTTTCCCATC CTTCTACAAC ATGTAGGGAA CAAATGTATA ATATTCTCAT GTGGATTCAT 7500 

GATAATTACA GAGATCCAGA AAGTGAGACA GATAATGACT CCCAGGAAAT ATTTAAGTTG 7560 

GCAAAAGATG TGCTGATTCA AGGATTGATC GATGAGAACC CTGGACTTCA ATTAATTATT 7620 

C6AAATTTCT GGAGCCATGA AACTAG6TTA CCTTCAAATA CCTTGGACCG GTTGCT GGCA 7680 

CTAAATTCCT TATATTCTCC TAAGATAGAA GTGCACTTTT TAAGTTTAGC AACAAATTTT 7740 

CrCCPOGAAA TGACCAGCAT GAGCCCAGAT TATCCAAACC CCA TGTT CGA GCATOCTCTG 7800 

TCAGAATGCG AATTTCAGGA ATATACCATT GATTCTGATT GGCGTTTCCG AAGTACTGTT 7860 

CTCACTCCGA TGTTTGTGGA GACCCAGGCC TCCCAGGGCA CTCTCCAGAC CCGTACCCAO 7920 

GAAGGGTCCC TCTCAGCTCG CTGGCCAGTG GCAGGGCAGA TAAGGGCCAC CCAGCAGCAG 7980 

CATGACTTCA CACTGACACA GACTGCAGAT G6AAGAAGCT CATTTGATTG GCT GACCG GG 8040 

AGCAGCACTG ACCOGCTGGT OGACCACACC AGTCCCTCAT CTGACTCCTT GCT GTTTGC C 8100 

CACAAGAGGA GTGAAAQGTT ACAGAGAGCA CCCTTGAAGT CAGTGGGGCC TGATTTTGGG 8160 

AAAAAAAGGC TGGGCCTTCC AGGGGAOGAG GTGGATAACA AAGTGAAAG6 TGCGGCCGGC 8220 

CGGACGGACC TACTACGACT GCGCAGACGG TTTATGAGGG ACCAGGAGAA 6CTCAGTTTG 8280 

ATCTATGCCA GAAAAGGCGT TGCTGAGCAA AAAOGAGAGA AGGAAATCAA GAG TGAGTTA 8340 

AAAATGAAGC AGGATGCOCA GGTCGTTCTG TACAGAAGCT ACCGGCAOGG AGACCTTCCT 8400 

GACATTCAGA TCAAGCACAG CAGCGTCATC ACCCOGTTAC AGG CCGTGG C CCAGAQGGAC 8460 

CCAATAATTG CAAAACAGCT CTTTAGCAGC TTGTTTTCTG GAATTTTGAA AGAGATQGAT 8520 

AAATTTAAGA CACTGTCIX5A AAAAAACftAC ATCACTCAAA AGTTGCTTCA AGACTTCAAT 8580 

CGTTTTCTTA ATACCACCTT CTCTTTCTrr CCACCCTTTG TCTCTTGTAT TCAGGACATT 8640 

AGCTGTCAGC ACGCAGCCCT GCTGAGCCTC GACCCAGCGG CTGTTAGCGC TGGTTGCCTG 8700 

GCCAGCCTAC AGCAGCCOGT GGGCATCOGC CTGCTAGAGG AGGCTCTGCT COGCCTGCTG 8760 

CCTGCTGAGC TGCCTGCCAA G0GAGTCC3GT GGGAAGGCCC GCCTCCCTCC TGATGTCCTC 8820 

AGATGGGTGG AGCTTGCTAA GCTGTATAGA TCAATTGGAG AATAOGACGT CCTCCGTGGG 8880 

ATTTTTACCA GTGAGATAGG AACAAAGCAA ATCACTCAGA GTGCATTATT AGCAGAAGCC 8940 

AGAAGTGATT ATTCTGAAGC TGCTAAGCAG TATGATGAGG CTCTCAATAA ACAAGA C TG6 9000 

GTAGATGGTG AGOCCACAGA AGCCGAGAAG GATTTTTGGG AACTTGCATC CCTTGACTGT 9060 

TACAACCACC TPQCTGAGTG GAAATCACTT GAATACTGTT CIACAG0C3\G TATACACAGT 9120

GACAACCCCC CAGACCTAAA TAAAATCTGG AGTGAACCAT TTTATCAOGA AACATATCTA 9180

CXTTTACATGA TCOGCAGCAA GCTGAAGCTG CTGCTCCAGG GAGAGGCTGA CCAGTCCCTG 9240

CTGAOlTTrA TTGACAAAGC TATGCAOGGG 6AGCTCCAGA AGGCGATTCT AGAGCTTCAT 9300

TACAGTCAAG AGCTGRGTCT GCTTTAOCTC CTGCAAGATG ATCTTGACAG AGCCAAATAT 9360

TACATTCAAA ATGGCATTCA GAGTmATG CACAATTATT CTAGTATTGA TGTCCTCTTA 9420

CACCAAAGTA GACTCACCAA ATTGCAGTCT GTACaVGGCTT TAACAGAAAT TCAGGAGTTC 9480

ATCAGCTTTA TAAGCAAACA AGGCAATTTA TCATCTCAAG TTCXXXTTTAA GAGACTTCTG 9540

AACACCTGGA CAAACAGATA TCCAGATGCT AAAATGGACC CAATGAACAT CTOGGATGAC 9600

ATCATCACAA ATOGATGTTT CTTTCTCACC AAAATAGAGG AGAAGCTTAC CCCTCTTCCa 9660

GAAGATAATA GTATGAATGT GGATCAAGAT GGAGACCCCA GTGACAGGAT GGAAGTGCAA 9720

GAGCAGGAAG AAGATATCAG CTCCCTGATC AGGAGTTGCA AGmrCCAT GAAAATGAAG 9780

ATGATAGACA GTGCCOGGAA GCAGAACRAT TTCTCACTTG CTATGAAACT ACTGAAOGAG 9840

CTGCATAAAG AGTCAAAAAC CAGAGACGAT TGGCTGGTGA GCTGGGTGCA GAGCTACTGC 9900

CGCCTGAGCC ACTGCCGGAG CCGGTCCCAG GGCTGCTCTG AGCAGGTGCT CACTGTGCTG 9960

AAAACAGTCT CTrTGTTGGA TGAGAACAAC GTGTCAW3CT ACTTAAGCAA AAATATTCTG 10020

GCTTTCOGTG ACCAGAACAT TCTCTTGGGT ACAACTTACA GGATCATAGC GAATGCTCTC 10080

AGCAGTGAGC CftGCCTGOCT TGCTGAAATC GAGGAGGACA AGGCTAGAAG AATCTTAGAG 10140

CTTTCTGGAT CCAGTTCAGA G6ATTCAGAG AAGGT6ATCG CGGGTCTGTA CCAGAGAGCA 10200

TTCCAGCACC TCTCTGAGGC TGTGCAGGCG GCTGAGGAGG AGGGCCAGCC TCCCTCCTGG 10260

AGCTGTGGGC CTGCAGCTGG GGTGATTGAT 6CTTACATGA 0GCTG6CAGA TTTCTGTGAC 10320

CAACAGCTGC GCAAGGAGGA AGAGAATGCA TCAGTTATTG ATTCTGCAGA ACTGCAGGOG 10380

TATCCRGCAC TTOTGGTGGA GAAAATGTTG AAAGCTTTAA AATTAAATTC CAATGAAGCC 10440

AGATTGAAGT TTCCTAGATT ACTTCAGATT ATAGAACGGT ATCCAGAGGA GACTTTGAGC 10500

CTCATGACAA AAGAGATCTC TTCCGTTCCC TGCTG6CAGT TCATCAGCTG GATCAGCCAC 10560

ATGGTGGCCT TACTGGACAA AGACCAAGCC GTTGCTGTTC AGCACTCIGT OGAAGAAATC 10620

ACTGATAACT ACCCGCAGGC TATTGTTTAT CCCTTCATCA TAAGCAGOGA AAGCTATTCC 10680

TTCAAGGATA CTTCTACTGG TCATAAGAAT AAGGAGTTTG TGGCAAGGAT TAAAAGTAAG 10740

TTGGATCAAG GAGGAGTGAT TCAAGATTTT ATTAATGCCT TAGATCAGCT CTCTAATCCT 10800

GAACIGCTCT TTAAGGATTG GAGCAATGAT GTAAGAGCTG AACTAGCAAA AACCCCTGTA 10860

AATAAAAAAA ACATTGAAAA AATGTAT6AA AGAATGTATG CaGCCTTGGG TGACCCAAAG 10920

GCTOCAGGCC TGGGGGCCTT TAGAAGGAAG TTTATTCAGA CTTTTOGAAA AGAATTTGAT 10980

AAACATTTTG GGAAAGGAGG TTCTAAACTA CTGAGAATXSA AGCTCAGTGA CTTCAACGAC 11040

ATTACCAACA TGCTACTTTT AAAAATGAAC AAAGACTCAA AGCCCCCTGG GAATCTGAAA 11100

GAATGTTCAC CCTGGATGAG CXY^CTTCAAA GTGGAGTTCX: TGAGAAATGA GCTGGAGATT 11160

CCCGGTCAGT ATGAOGGTAG GGGAAAGCCA TTGCCAGAGT ACCAC3GTGCG AATCGCCGGG 11220

TTTGATGAGC GGGTGACAGT CATGGCGTCT CTGCGAAGGC CCAAGOGCAT CATCATCCGT 11280

GGCCATGACG AGAGGGAACA CCCTTTCCTG GTGAAGGGTG GGGAGGACCT GCXXSCAGGAC 11340

CAGCGOGTGG AGCACCTCTT CCAGGTCATG AATGGGATOC T66CCCAAGA CTCOGCCTGC 11400

AGCX:AGAGGG CCCTGCAGCT GAGGACCTAT AGOGTTGTGC CCATGACCTC CAGGTTAGGA 11460

TTAATTGAGT GGCTTGAAAA TACTGTTACC TTGAAGGACC TTCTTTTGAA CACXATGTCC 11520

CAAGAGGAGA AGGCGGCTTA CCTGAGTGAT CCCAGGGCAC CGCCGTGTGA ATATAAAGAT 11580

TGGCTGACAA AAATGTCAGG AAAACATGAT GTTGGAGCTT ACATGCTAAT GTATAAGGGC 11640

GCTAATCGTA CTGAAACAGT CACX5TCTTTT A6AAAA0GAG AAAGTAAAGT GCCTGCTGAT 11700

CTCTTAAAGC GGGCCTTCGT GAGGATGAC5T AOUUSCCCTG AGGCTTTCCT GGCGCTCCGC 11760

TCCCACTTCG CCAGCTCTCA CGCTCTGATA TGCATCAGCC ACTGGATCCT GGGGATTGGA 11820

GACAGACATC TGAACAACTT TATGGTGGCC ATGGAGACTG GOGGGGTGAT CGGGAT06AC 11880

TTTGGGCATG OGTTTGGATC CGCTACACAG TTTCTGCXAG TCCCTGAGTT GATGCCTTTT 11940

GGGCTAACTC GCCAGTTTAT CAATCTGATG TTACCAATGA AAGAAACGGG CCTTATGTAC 12000

AGCATCATGG TACACGCACT CCGGGCCTTC CGCTCAGACC CTGCCCTGCT CACCAACACX: 12060

ATGGATGTGT TTGTCAAGGA GCCCTCCTTT GATTGGAAAA ATTTTGAACA GAAAATGCTG 12120

AAAAAAGGAG GGTCATGGAT TCAAfiAAATA AATGTTGCTG AAAAAAATTG GTACCCOCGA 12180

CAGAAAATAT GTTACGCTAA GAGAAAGTTA GCAGGTGCCA ATCCA6CAGT CATTACTTGT 12240

GATGAGCTAC TCCTGGGTCA TGAGAAGGCC CCTGCCTTCA GAGACTATGT GGCTGTGGCA 12300

G6RGGAAGCA AAGATCACAA CATTCGTGCC CAAGAACCAG AGAGTGGGCT TTCAGAAGAG 12360

ACTCAAGTGA AGTGCCTGAT GGACCAGGCA ACAGACCCCA ACATCCTTGG CAGAACCTGG 12420

GAAGGATGGG AGCCCTGGAT GTGAGGTCT6 TGGGAGTCTG CAGATAGAAA GCATTACATT 12480

GTTTAAAGAA TCTACTATAC TTTGGTTGGC AGCATTCCAT GAGCTGATTT TCCTGAAACA 12540

CTAAAGAGAA ATGTCTTTTG TGCTACAGTT TCGTAGCATG AGTTTAAATC AAGATTATGA 12600

TCAGTAAATG TGTATGGGTT AAATCAAAGA TAAGGTTATA GTAACATCAA AGATTAGGTG 12660

AGGTTTATAG AAAGATAGAT ATCCAGGCTT ACCAAAGTAT TAAGTCAAGA ATATAATATG 12720

TGATCAGCTT TCAAAGCATT TACAAGTGCT GCAAGTTAGT GAAACAGCTG TCTCCGTAAA 12780

TGGAGGAAAT GTGGGGAAGC CTTGGAATGC CCTTCTGGTT CTGGCACATT GGAAAGCACA 12840

CTCAGAAGGC TTCATCACCA AGATTTTGGG AGAGTAAAGC TAAGTATAGT TGATGTAACA 12900

TTGTAGAAGC AGCATAGGAA CAATAAGAAC AATAGGTAAA GCTATAATTA TGGCTTATAT 12960

TTAGAAATGA CTGCATTTGA TATTTTAGGA TATTTTTCTA GGTTTTTTCC TTTCATTTTA 13020

TTCTCTTCTA GTTTTGACAT TTTATGATAG ATTTGCTCTC TAGAAGGAAA CGTCTTTATT 13080

TAGGAGGGCA AAAATTTTGG TCATAGCATT CACTTTTGCT ATTCCAATCT ACAACTGGAA 13140

GATACATAAA AGTGCTTTGC ATTGAATTTG GGATAACTTC AAAAATCCCA TGGTTGTTGT 13200

TAGGGATAGT ACTAAGCATT TC3VGTTCCAG GAGAATAAAA GAAATTCCTA TTTGAAATGA 13260

ATTCCTCATT TGGAGGAAAA AAAGCATGCA TTCTAGCACA ACAAGATGAA ATTATGGAAT 13320

ACAAAAGTGG CTCCTTCCCA TGTGCAGTCC CTGTCCCCCC COGCXaWSTCC TCCACACCCA 13380

AACTGTTTCT GATTGGCTTT TAGCTTTTTG TTGTTTTTTT TTTTOCTTCT AACACTTGTA 13440

TTTGGAGGCT CTTCTGTGAT TTTGAGAAGT ATACTCTTGA GTGTTTAATA AA6TTTTTTT 13500

CCAAAAGTA



Seq ID NO: 99 Protein sequence: 
Protein Accession #: NP_008835.5



1 11 21 31 41 51

MAGSGAGVRC SLLRLQETLS AADROGAAIiA GHQLIRGLGQ ECVLSSSPAV UUiQTSLVPS 60

SDFGLIjVFVR KSUISIBFRB CRBEILKFLC IFLEKMGQKI APYSVEIKNT CTSVYTKDRA 120

AKCKIPAU3L LIKIjLQTFRS SRLMDEPKIG ELFSKFYGEL ALKKKIPDTV LEKVYEI*U3L 180

LGEVHPSEMI NNAENLPRAF LGELKTQMTS AVREPKLPVL AGCLKGLSSL LQIPTKSME2 240

DPQTSRBIFN FVLKAIRPQI DUCRVAVPSA GliRLFALHAS QPSTOiLDIlY VSLFEVLLKW 300

CAHTOVBLXK AALSALESPL KQVSNMVAKN AEMHKNKLQY FMBQPYGIIR NVDSHNKELS 360

lAIKGYGLFA GPCKVIKAKD VDFMYVBLIQ RC3CQNPLTQT DTGDDaWQM PSFLQSVASV 420

liYLDTVPEV YTPVLKHLW NQIDSFPQYS PKMQLVCC8A IVKVFLALAft KGPVLRNCIS 480 

TWEQGLIRI CSKPWItPKG ?ESESEDHRA SGBVRTCBDfX VPTYKDYVDL FRHZtLSSDQM 540 

MDSIIiADEAF PSVNSSSESL NHLLYDEFVK SVLKIVEKLD LTLBIOTVGS QSIGDEAPGV 600 

K>aPTSDPAA NLH?AKPKDP SAPINLVBFC REILPEKQAE FFBPWVYSFS YELILQSTRL 660 

PLISGFYKIiL SITVRNAKKI KYPBCVSPKS LKHSPEDPEK YSCFALFVKF GKBVAVKMKQ 720 

YKDBUASCL TFLLSliPENI lELDVRAYVP ALQMAFKLGL SYTPLAEVGL HALE EWSIYI 780 

DRHVKQPYYK DILPCUJCYb RTSALSDSTR NSKBVSALSR AAQKGFNKW LXHLKKTXNZi 840 

SSNEAISLEE IRIRWQMLG SLQC3QINKNL LTVTSSDEMM KSYVAWDREK RLSEAVPFRE 900 

MKPVIFU)VF LPRVTELALT ASDRQTKVAA CELLHSMVMP MLGKATQMPE GGQGAPPMYQ 960 

LYKRTPPVI*L RliACDVDQVT RQLYBPLVMQ LIHWPTKXKK PBSQDTVALL EAILDGIVDP 1020 

VDSTLRDPOG RCIREPLKWS IKQITPQQQE KSPVNTKSLF KRLYSLAIiHP NAFKRI/3ASL 1080 

APKNIYRBFR EBESI.VEQFV FEALVrmES LAIiAHADEKS LGTIQQCCDA IDHLCRIIBK 1140 

KHVSIiKKAKK HRLPRGFPPS ASLCLLDLVK WLLAHOGRPQ TBCRHKSIEL FYKFVPLLPG 1200 

NRSPNLWLKD VLKEEGVSFL INTFBQGGCG QPSGILAQPT LLYLR6PFSL QATLOaLDIiIi 1260 

LAALECYNTF IGERTVGAIiQ VLGTEAQSSL LKAVAFFLES lAMHDIIAAE KCFGTGAAGN 1320 

RTSPQEGERY NYSKCTVWR IMETnTLLN TSPEGWKLLK KDLCNTHLMR VLVQTLCBPA 1380 

SIGFNIOJVQ VMAHLPDVCV JILMXAIiKMSP YKDILETHLR EKITAQSIEE UAVNLYGPD 1440 

AQVORSRIAA WSACKQLHR AGIiIiHMII»PS QSTDIiHHSVG TELLSLVYKG lAPGDERQCL 1500 

PSLDLSCKQL ASGLLELAFA PGGLCERLVS LLLNPAVLST ASLGSSQGSV IHFSHGEYFY 1560 

SLPSETINTB LUtNLDLAVL ELMQSSVmiT KHVSAVLMGM LDQSPRERAN QKHQGL KLAT 1620 

TILQHWKKCD SWWAKDSPLE TKMAVIALIA KILQIDSSVS PNTSHGSFPE VFTTYISLLA 1680 

OTKUJLHIiKG QAVTLLPFPT SLTGGSLEEI, RRVLEQLIVA SFPMQSREFP PGTPRPNNYV 1740 

DCMKKFLDAL ELSQSPMLLE LMTEVLCREQ QHVMBELPQS SFRRIARRGS CVTQVGLLES 1800 

VYEMFRKDDP RLSFTRQSFV DRSLLTLLWH CSLDALREPF STIWOAIDV LKSRFTKUJB 1860 

STFDTQITKK MGYYKIL0VM YSRLPKDDVH AKESKINQVF HGSCITBC2JB LTKTLIKLCY 1920 

DAFTENMAGB NQLLERRRLY HCAAYNCAIS VICCVFNBLK FYQCFLFSBK PEKNLLIFEN 1980 

LIDLiCRRYNF PVEVEVPMER KKKYIEIRKE AREyUUiGDSD GPSYMSSI.SY LADSTLSBEM 2040 

SQFDFSTGVQ SYSYSSQDPR PATGRFRRRE QRDPTVHDDV LBLEMDBUIR HEOttPLTAL 2100 

VKHMHRSLGP PQGEBDSVPR DLPSHMKFLH GKLGMPIVPL NTRLFLAKLV INTEEVFRPY 2160 

AKHMLSPLLQ LAASENNGGE GIHYMWEIV ATILSWTCLA TPTGVPKDEV liANRIiLNPLM 2220 

KHVPHPKRAV FRHNLEIIKT LVSCWKDCLS IPYRLIFBKF SGKDPNSKDN SVGIQU/GIV 2280 

MANDLPPYDP QCGIQSSEYF QALVNUMSFV RYKEVYAAAA EVLGLXLRYV MBRKNILEES 2340 

LCELVAKQLK QHQNTMBDKP IVCLNKVTKS FPPLADRFMN AVFPLLPKFH GVLKTLCLEV 2400 

VLCRVBGMTE LYFQLKSKDF VQVMRHRDDE RQKVCLDIIY KMHPiCLKPVE LREUiNPWE 2460 

FVSHPSTTCR EQMYNILMWI HDNYRDPESE TDKDSQBIFK IAKDVI.IQGL IDENPGLOLI 2520 

IRNFWSHETR LPSNTLDRLL ALKSLYSPKI BVHFLSIATN FLLEMTSMSP DYPNPMPEHP 2580 

liSECEFX^EYT IDSDWRFRST VLTPMFVETQ ASQGTLQTRT QEGSLSARWP VAGQIRATQQ 2640 

^FTLTOTA DGRSSPDWLT GSSTDPLVDH TSPSSDSLLF AHKRSERLQR APLKSVGPDP 2700 

GKKRIX5LPGD EVDNKVKGAA GRTDLLRLRR RPMRDQEKLS LMYARKGVAE QKREKEIKSE 2760 

LKMKQDAQW LYRSYRHGDL PDIQIKHSSL ITPLQAVAQR DPIIAKQLPS SLFSGILKBN 2820 

DKFKTLSEKN NITQKIiLQDF NRFLNTTFSF FPPPVSCIQD ISOQHAALLS LDPAAVSAGC 2880 

LASLQQPVGI RLLEEALLRL LPAELPAKRV RGKARLPPDV LRHVELAKLY RSIGEYDVLR 2940 

GIFTSEIGTK QITQSALLAE ARSDYSEAAK QYDBALNKQD WVDGEPTEAE KDFWEIASU) 3000 

CYNHLAEWKS LEYCSTASID SENPPDLNKI WSEPPYQETY LPYMIRSKLK liLQGEADQS 3060 

LLTFIDKAMH GELQKAILEL HYSQELSLLY LLQDDVDRAK YYIQNGIQSP MQNYSSIDVIi 3120 

LHQSRLTKLQ SVQALTEIQE FISFISKQGM LSSQVPLKRL UITWTiniYPD AKKDPMMIWD 3180 

DIITNRCFFL SKIEEKLTPL PEDNSMKVDQ DGDPSDRMEV QEQBEOISSIi IRSCKPSMKM 3240 

KMIDSARKQN NFSLAMKLLK ELHKESKTRD DWLVSWVQSY CRLSHCRSRS QGCSBQVLTV 3300 

LKTVSLLDEN NVSSYLSKNI LAFRDQNILL GTTYRIIANA LSSEPACIiAE lEEDKARRIL 3360 

ELSGSSSEDS EKVIAGLYQR AFQHLSEAVQ AAEBEAQPPS WSCGPAAGVI DAYMTLADFC 3420 

DQQLRKEEEH ASVZDSAELQ AYPALWEKM LKAUCUISNE ARLKFPRU/J IIERYPEETL 3480 

SLMTKBISSV PCWQFISWIS HMVALLDKDQ AVAVQHSVEB ITONYPQAIV YPFIISSESY 3540 

SFKDTSTGHK NKEFVARIKS KLDQGGVIQD FIMALDQLSN PELLFKDWSN DVRAELAKTP 3600 

VNKKNIEKMY ERMYAALGDP KAPGLGAPRR KFIQTFGKEP DIHPGKGGSK LLRMKLSDPN 3660 

DITNMLLLKM NKDSKPPGNL KECSPWMSDF KVEPIiRNELE IPGQYDGRGK PLPBYHVRIA 3720 

GFDERVTVMA SLRRPKRIII RGHDEREHPF LVKGGEDLRQ DQRVBQLFQV MNGILAQDSA 3780 

CSQRALQIiRT YSWPMTSRL GLIEWLENTV TLKDUiLNTM SQEEKAAYLS DPRAPPCBYK 3840 

DHLTKMSGKH DVGAYMLMYK GANRTETVTS FRKRESKVPA DLLKRAFVRM STSPEAPLAL 3900 

RSHPASSHAL ICISHWILGI GDRHLNMFMV AMETGGVIGI DFGHAFGSAT QFLPVPELMP 3960 

FRIiTRQFINZi MLPMKBTGLM YSIMVHALRA PRSDPGLLTM IWDVPVKEPS FDWKUPEQKM 4020 

LKKGGSWIQE INVAEKKWYP HQKICYAKRK LAGANPAVIT CDEXiLGHEK APAFRDYVAV 4080 
ARGSKDHNZR AQBPESGLSE ETQVKCLMDQ ATDPNILGRT WBQWEPHM 

Seq ID NO: 100 DNA sequence
Nucleic Acid Accession #: NM_000673
Coding sequence: 101-1225

1 11 21 31 41 51

| | | | | |

ATGTGAAGGC ACAAGCTGCT GTTATATACA ACAGAGTGAA CTGAGCATCA 6TCAGAAAAA 60 

GTCTATGTTT GCAGAAATAC AGATCCAAGA CAAA6ACAGG ATGGGCACTG CTGGAAAAGT 120 

TATTAAATGC AAAGCAGCTG TGCTTTGGGA GCAGAAfiCAA OOCTTCTCCA TTGACGAAAT 180 

AGAAGTTGCC CCACCAAAGA CTAAAGAAGT TCGCATTAAG ATTTTGGCCA CAGGAATCTG 240 

TCGCACAGAT GACCATGTGA TAAAAGGAAC AATGGTGTCC AAGTTTCCAG TGATTGTGGG 300 

ACATGAGGCA ACTGGGATTG TAGAGAGCAT TGGAGAAGGA GTGACTACAG TGAAACXaCG 360 

TGACAAAGTC ATCCCTCTCT TTCTGCCACA ATGTAGAGAA TGCAATGCTT GTCGCAACCC 420 

AGAIGGCAAC CTTTGCATTA GGAGCGATAT TACTGGTCGT GGAGTACTCG CTGATGGCAC 480 

CACCAGATTT ACATGCAAGG GCAAACCAGT ACACQ^CTTC ATGAACACCA GTACATTTAC 540 

CGAGTACACA GTGGTG6ATG AATCTTGTGT TGCTAAGATT GATGATGCAG CTCCTCCTGA 600 

GAAAGTCTGT TTAATTGGCT GTGGGTTTTC CACTQGATAT GGOGCTGCTG TTAAAACTGG 660 

CAAGGTCAAA CCTGGTTCCA CTTGCGTCGT CTTTGGCCTG GGAGGAGTT6 GCCTGTCAGT 720 

CATCATGGGC TGTAAGTCAG CTGGTGCATC TAGGATCATT GGGATTGACC TCAACAAAGA 780 

CAAATTTGAG AAGGCCATGG CTGTAGGTGC CACTGAGTGT ATCAGTCCCA AGG ACTCTA C 840 

CAAACXXATC AGTGAGGTGC TGTCAGAAAT GACAGGCAAC AAOGTGGGAT ACAOCTTTGA 900 

AGTTATTCGG CATCTTGAAA CCATGATTGA TGOCCTGGCA TCCTGCCACA TGAAC TATGG 960 

GACCAGGGTG GTTGTAGGAG TTCCTCCATC AGOCRAGATG CTCACCTATG ACCCGAT6TT 1020 
GCTCTTCACT GGACGCACAT GGAAGGGATG TGTCrTTGtSA GGrTTGAAAA GCAGAGATGA 1080

TGTCCX:AAAA CTACrr^CTG AGTTCCTCGC AAWSAA^ 1140

TCATGrriTA CCATTTAAAA AAATCACTGA AGGATTTCAG CTGCTCAATT CAGGACAAAG 1200 

CATTOGAACG GTCCTGAOCrr TTTGAGATCC AAACTGGCAG C5AGGTCTGTG TTCTCATGGT 1260 

QUUTOGACT TTCTCrTCrG AGACTTOCT 1320

ACAAGCATAA GTAGAAGATT TCrTGAAGAC ATAGAACCCT TATftAAGAAT TATTAACCTT 1380 

TATAAACATT TAAAGTCTTG TGAGCACCTG GGAATTACrTA TAATAACAAT GTTAA IATTT 1440 

TTGATTTACA TTTTGTAAGG CTATAATTGT ATCTTTTAAG AAAACATACA CTTGGATTTC 1500 

TATGTTCAAA TCGAGATTTT TAAGAGTTTT AACCAGCTGC TGCAGATATA TAACTCAAAA 1560 

CaStATACC CTATAAACAT ATACTAAATG OITCTCCCAG AGTAATATTC ACTTAACACA 1620 

SSSCTW TATTTTTTAG ATTTGAATAT AAATt?r AT^ 1680 

TAACTTCGAT TACATTTTGA AATCAlGTTCA TTCCATGATO CATATTACTG GATTAGATTA 1740 

LuUVGACAG AAAAGATTAA GGGAOGGGCA CATTTTTW 1800

ATAACrTGGT GAAACTGAAA AAGTATATCA TATGGGTACA CAflCGCTATT TOCCRGCATA 1860

tI??2SmT TTAGAAAATA TTC^^ 1920

ATATTATCAT ACTTATCATA ATCTTCAATT TGATACAGTA GAATTGCA AG TCCCTRAGTC 1980 

OT™^ CHGCTTAGTA GTGACTCCAT TTAATAAAAA GTCTTTTTAG TTTCT 2040 

CTAAACCG 

Seq ID NO: 101 Protein sequence: 
Protein Accession #: NP_000664

1 11 21 31 41 51

MGTAGKVIKC KAAVLWEQKQ PFSIEEIEVA PPKTKEVRIK ILATGICRTD DHVIKGTMVS 60
KFPVIVGHEA TGIVESIGEG VTTVKPGDKV IPLFLPQCRE CNACRNPDGN LCIRSDITGR 120
GVLADGTTRF TCKGKPVHHF MNTSTFTEYT VVDESSVAKI DDAAPPEKVC LIGCGFSTGY 180
GAAVKTGKVK PGSTCVVFGL GGVGLSVIMG CKSAGASRII GIDLNKDKFE KAMAVGATEC 240
ISPKDSTKPI SEVLSEMTGN NVGYTFEVIG HLETMIDALA SCHMNYGTSV VVGVPPSAKM 300
LTYDPMLLFT GRTWKGCVFG GLKSRDDVPK LVTEFLAKKF DLDQLITHVL PFKKISEGFE 360
LLNSGQSIRT VLTF

Seq ID NO: 102 DNA sequence 

Nucleic Acid Accession #: NM_006783.1

Coding sequence: 1-786

1 11 21 31 41 51 

irGGArrGGG Lvcgctcca cactttcatc gggggtgtca acaaacactc caccmcatc 60 

gSaAGGTCT GGATCACAGT CATCTTTATr TTCCGAGTCA TGATCCTAGT GGTCOrGTO 120 

S^GTGT GGGGTGACGA GCAAGAGGAC TTCGTCTGCA ACACACTGCA ACCGG6MGC 180 

AAAAATGTGT GCTATGACCA CTTTTTCCOG GTGTCCCACA TCCGGCTGTG GGCCCTCCAG 240 

TCT^CCC AGOGCTGCTG GTG6CCATGC ATGTGGCCTA CTACAGGCAC 300 

GAAACCACTC GCAACTTCAG GOGAGGAGAG AAGAGGAATG ATTTCAAAGA CATAGAG6AC 360 

ATT^AAAGC ACAAGGTTCG GATAGAGGGG TCGCTGTGGT GGACGTACAC CAGCAGCATC 420 

TTTTTCCGAA TCATCTTTCA AGCAGCCTTT ATGTATGTGT TmCTTCCT TTACAftTGGG 480 

TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCOCCSU^XTT TGTT6ACT6C 540 

TTTATTTCTA GGCCAACAGA GAAGACCGTG TTTACCATTT TTATGATTTC TGOCTCTGTG 600 

ATTTGCATGC TGCTTAAOGT GGCAGAGTTC TGCTACCTGC TGCTGAAAGT GTGTTTTAGG 660 

AGATCAAAGA GAGCACAGAC GCAAAAAAAT CACXICCAATC ATGCCCTAAA GGAGAGTAAG 720 

CAGAATGAAA TCAATCACCT GATTTCAGAT AGTGGTCAAA ATGCAATCAC AGGTTTCCCA 780 
AGCTAA 

Seq ID NO: 103 Protein sequence: 
Protein Accession #: NP_006774.1

1 11 21 31 41 51 

MDWGTLHTPI GGVNKHSTSI GKVWITVIFI FRVMILWAA QBVWGDEQED FVCNTLQPGC 60 

KNVCYDHFFP VSHIRLWALQ LIFVSTPALL VAMHVAYYRH BTPRKPRRGE KRNDFKDIED 120 

IKKHKVRIEC SLWWTYTSSI PFRIIFEAAF MYVPYFLYNG VHLPWLKOG IDPCPNLVDC 180 

PISRPTBKTV FTIPMISASV ICMLLNVAEL CYLLLiCVCPR RSKRAQTOKN HPNHALKESK 240 
QNEMMELISD SGQNAITGFP S 

Seq ID NO: 104 DNA sequence 
Nucleic Acid Accession #: NM_020411
Coding sequence: 86-526 

1 11 21 31 41 51

GGACCTGGGA AGGAGCATAG GACAGGGCAA GGOGGGATAA GGAGGGGCAC CACAGCCCTT 60 

AAGGCACGAG GGAACCTCAC TCCGCATGCT CCTTTGGTGC CCACCTCAGT G OGCATG TTC 120 

ACTGGGCGTC TTCCCATCGG CCCCTTCGCC AGTGTGGGGA ACGCGOCGGA OCTGTGAGCC 180 

GGOGACTOGG GTCCCTGAGG TCTGGATTCT TTCTCCGCTA CTGAGACAOG GCGGACACAC 240 

ACAAACACAG AACCACACAG CCAGTCCCAG GAGCCCAGTA ATGGAGAGCC CCAAAAAGAA 300 

GAACCAGCAG CTGAAAGTCG GGATCCTACA CCTGGGCAGC AGACAGAAQA AGATCAGGAT 360 

ACAGCTGAGA TCCCAGTGCG CGACATG6AA GGTGATCTGC AAGAGCTGCA TCAGTCAAAC 420 

ACCGGGGATA AATCTGGATT TGCSGTTCOSG CGTCAAGGTG AAGATAATAC CTAAAGAGGA 480 

ACACTCTAAA ATGCCAGAAG CAGGTGAAGA GCAACCACAA GTTTAAATGA AGACA AGCTG 540 
AAACAACGCA A6CTGGTTTT ATATTAGATA TTIGACTTAA ACTATCTCAA TAAAGTTTTG 600
CAGCTTTCAC CAAAAAAAAA AAAAAA



Seq ID NO: 105 Protein sequence: 
Protein Accession #: NP_065144.1

1 11 21 31 41 51

| | | | | |

MLIjWCPPQCA cslgvfpsap spvwgtrrsc bpatrvpevm ilsfllregg htqtqshtas 60

PRSPVMESPK KKNQQLKVGl LHLGSRQKKI RZQLRSQGftT HKVICICSC3S QTPGINLDW 120
SGVKVKIIPK EEHCKMPEAG EBQPQV

Seq ID NO: 106 DNA sequence
Nucleic Acid Accession #: J04129
Coding sequence: 99-587



1 11 21 31 41 51

| | | | | |

CATCOCTCTG GCTCCAGAGC TCAGAGCCAC ccacacccgc AGCCATGCTG TGCCTCCTGC 60
TCACCCTGGG GGTG6CCCTG GTCTGTGGTG toccggccat GGACATCCCC CAGACCAAGC 120
AGGACCTGGA GCTCCCAAAfi TTGGCAGGGA cx:tggcactc CATGGCCATG GCGACCAACA 180
ACATCTCCCT CATGGCGACA CTGAAGGCCC ctctgagggt CCACATCACC TCACTGTTGC 240
CC3VCCCCCGA GGACAACCTG GAGATCGrrC TGCACAGATG GGAGAACAAC AGCTGTOITG 300
AGAAGAAGGT CCTTGGAGAG AAGACTGGGA ATOCAAAGAA GTTCAAGATC AACTATAOGG 360
TGGOGAACGA GGCCACX3CTG CTCGATACTG ACTAOGACAA [missing] CrCTGCCTAC 420
AGGACACCAC CACCCCCATC CAGAGCATGA TGTGCCAGTA [missing] GTCCTGGTGG 480
AGGACGATGA GATCATGCAG GGATTCATCA GQGCTTTCAG [missing] AGGCACCTAT 540
GGTACTTGCT GGACTTGAAA CAGA3GGAAG AGCCGTGCCG [missing] ACCTCG6CCT 600
CCAGGAAGAC CAGACTCCCA CCCTTCCACA CCTCCAGAGC [missing] OCTCCTGCCC 660
TTTCAAA6AA TAACCACAGC TCAGAAGA06 ATGAGGTGGT [missing] GCXJUGCCCT 720
TCCTGCTGCA CAOCTGCACX: ATTGOCATGG GGAGGCTGCT [missing] AGAGTCTCTG 780
GCAGAGGTTA TTAATAAACC CTTGGAGCAT G



Seq ID NO: 107 Protein sequence: 
Protein Accession #: AAA60147

1 11 21 31 41 51

| | | | | |

MDIPQTKQDL ELPKLAGTWH SMAMATNNIS LMATLKAPLR VHITSLLPTP EDNLBIVLHR 60
WENNSCVEKK VLGEKTGNPK KPKINYTVAN BATtLDTDYD NFLFLCWJDT TTPIQSMMOQ 120
YLARVLVEDD EIMQGPIRAF RPLPSHLWYL LDLRQMEEPC RP

Seq ID NO: 108 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 48-794 



1 11 21 31 41 51

| | | | | |

TCCCAGGCAG CAGTTAGCCC GCCGCCOGCX: TGTGTGTCCC CAGAGCCATG GAGAGAGCCA 60
GTCT6ATCCA GAAGGCCAAG CTGGCAGAGC AGGCCGAACG CTATGAGGAC ATGGCAGCCT 120
TCATGAAAGG OGCOGTGGAG AAGGGGGAGG AGCTCTCCTG CGAAGAGCGA AACCTGCTCT 180
CAGTAGCCTA TAAGAACGTG GTGGGGGGCC AGAGGGCTGC CTGGAGGGTG CTGTCCAGTA 240
TTGAGCAGAA AAGCAACGA6 GAGGGCTOQG AGGAGAAGGG GOCCGAQGTG 06TGA6TACC 300
GGGAGAAGGT GGAGACTGAG CTCCAGQGCG TGTGCGACAC CGTGGTGGGC CTGCIGGACA 360
GCCACCTCAT CAAGGAGGCC GGGGACXjCGG AGAGCCGGGT CTTCTACCTG AAGAT6AAGG 420
GTGACTACTA CCX3CTACCTG GCOGAGGTGG CCACCGGTGA CGACAAGAAG CGCATCATTG 480
ACTCAGCCCG GTCAGCCTAC CAGGAGGCCA TGGACATCAG CAAGAAGGAG ATGCCGCCCA 540
CCAACCOCAT CCGCCTGGGC CTGGOOCTGA ACTTTTCCGT CTTCCACTAC GAGATCGCCA 600
ACAGCCCCX3A G6AGGCCATC TCTCTGGCCA AGACCACTTT CGAOGAGGCC ATGGCTGATC 660
TGCACACXXrr CAGCGAGGAC tcctacaaag ACAGCACCCT GATCATGCAG CTGCT6C6AG 720
ACAACCTGAC ACTGTGGACG gcogacaaog CCGGGGAAGA GGGGGGCGAG GCTCCdCAGG 780
AGCCCCAGAG CTGAGTGTTG cccgccaccg CCCCGCCCTG CCCCCTCCAG TCCCCCACCC 840
TGCCGAGAGG ACTAGTATGG ggxgggaggc CCCACCCTTC TCCCCTAGGC GCTGTTCTTG 900
CTCCAAAGGG CTCCGTGGAG agggactggc AGAGCTGAGG CCACCTGGGG CTGGGGATCC 960
CACTCTTCTT GCAGCTGTTG agogcaccta ACCACTGGTC ATGCCCCCAC CCCTGCTCTC 1020
CGCACCCGCT TCCTCCCGAC cxx:aggacca GGCTACTTCT GOCCTCCTCT TGCCTCCCTC 1080
CTGCCCCTGC TGCCTCTGAT c6taggaatt GAGGAGTGTC C06CCTTGTG GCTGAGAACT 1140
GGACAGTGGC AGGGGCTGGA GATGGGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG 1200
CGCGCGCGCC AGTGCAAGAC CGAGATTGAG GGAAAGCATG TCTGCTGGGT GTGACCATGT 1260
TTCCTCTCAA TAAAGTTCCC CTGTGACACT C



Seq ID NO: 109 Protein sequence: 
Protein Accession #: NP_006133.1

1 11 21 31 41 51

MERASLIQKA KLAEQAERYE DMAAFMKGAV EKGEELSCEE RNLLSVAYKN WGGQRAAWR 60
VLSSIEQKSN EEGSEEKGPE VREYRBKVET ELQGVCDTVL GLLDSHLIKE AGDAESRVFY 120
LKMK(H>yyRY LAEVATGDDK KRIIDSARSA YQEAMDISKK EMPPTNPIRL GLALNFSVFH 180
YEIANSPEEA ISLAKTTPDE AMADLHTLSE DSYKDSTlilM QLLRQNLTLH TADNAGEBGG 240
EAPQEPQS

Seq ID NO: 110 DNA sequence 
Nucleic Acid Accession #: NM_000695 
Coding sequence: 407-1564 

1 11 21 31 41 51

| | | | | |

CACGAGTTGG TTTGGGAGCT GCCAGTCTCC TGGGAGGATC GCAGTCAGCA GAGCAGGGCT 60
GAGGCCTGGG GGTAGGAGCA GAGCCTGCGC ATCTGGAGGC AGCATGTCCA AGAAAGGGAG 120
TGGAGGTGCA GCGAAGGACC CAGGGGCAGA GCCCAOGCTG GGGATGGACC CCTTOGAGGA 180
CACACTGCGG OGGCTGOGTG AGGCCTTCAA CTGAGGGCGC AGG0GGCCG6 CCGAGTTCCG 240
GGCTGCGCAG CTOCAGGGCC TGQGCCACTT OCTTCAAGAA AACAAGCAGC TTCTGCGOSA 300

OGTGCIGGOC CAGGACCTGC ATAAGCCAGC TTTOCSAGGCA GACATATCTG AGCTCATCCT 360 

TTGCCAGAAC GAGGTTGACT AOGCTCTCAA GAACCTTCAG GCCIGGATGA AOSMGAACC 420 

ACGGTCCACG AACCTGTTCA TGAAGCIGGA CTCGGTCITC ATCTGGAAGG AACXXTTTCG 480 

CCTOGTCCTC ATCATOSCAC CCTGGAACTA CCCATTGAAC CTGAOCCTGG TGCTCCTGGT 540 

GGGCACCCTC CCOGCAGQGA ATTGOGTGGT GCrOAAGCQG TCAGAAATCA GCCAGGGCAC 600 

AGAGAAGGTC CTGGCTCftGQ TGCTGCCCX» GTACCTGGAC CAGAGCTGCT TTGCCGTGGT 660

GCTGGGCGGA CCCCAGGAGA CAGGGCAGCT GCTAGAGCAC AAGTTGGACT ACATCTTCTT 720 

CACAGGGAGC CCTCGTGTGG GCAAGATTGT C3^TGACTGCT GCCACCAAGC ACCTGAOGCC 780 

•TOTCACCCTG GAGCTGGGGG GCAAGAACCC CTGCTA0C3TG GACGACAACT GOGACCCCCA 840 

GAOOyrGGCC AACOGCGTOG CCTGGTTCTG CTACTTCAAT GCOGGCCAGA CCTGCGTGGC 900 

CCCTGACTAC GTCCXGTOCA GCCCOGAGAT 6CAGGA6AGG CTGCTGCCCG CCCTGCAGAG 960 

CACCATCACC CGTTTCTATG GOGACGACCC CCAGAGCTCC CCAAACCTGG GCCGCATCAT 1020 

CAACX:aGAAA CAGTTCCAGC GGCTGCGGGC ATTGCTGGGC TGOSGCOGOG TCGCCATTGG 1080 

GC3GCCAGAGC AACGAGAGCG ATCGCTACAT CGCCCCCAOG GTGCTGGTGG ACGTCCAGGA 1140 

GAOXSAGCCT GTGATGCAGG AGGAGATCTT CGGGCCCATC CTGCCCATCG TGAACGTGCA 1200 

GAGCGTGGAC GAGGCCATCA AGTTCATCAA CCX»CAGGAG AAGCCCCTGG CCCTGTACGC 1260 

CTTCTCCAAC AGCAGACAGG TTGTGAACCA GATGCIGGAG CQGACCAGCA GCGGCAGCTT 1320 

Sa^CITCA CCTACATATC TCTGC^ 1380 

CCACAGTGGG ATGGGCCX3GT ACCACGGCAA GTrCACCTTC GACAOCTTCT CCCACCACCG 1440 

CACCTGCCTG CTCGCCXXCT CCGGCCTGGA GAAATTAAAG GAGATCX3GCT ACXKACOTA 1500 

TACCGACTGG AACCAGCAGC TGTTACGCTG GGGCATGGGC TCCCAGAGCT G^CCCTCCT 1560 

GTCAGCGTCC CACCCGCCTC CAACGGGTCA CACAGAGAAA CCTGAGTCTA GCCATGAGGG 1620
GCTTATGCTC CCAACTCACA TTGTTCCTCC AGACCGCAGG CTCCCCCAGC CTCAGGTTCC 1680
TGGAGCTGTC ACATGACTGC ATCCTCCCTO CCaGGGCTGC AAAGCAAGGT CTI^II^Ift 

TCTGGGGGAC GCTGCTOGAG AGAG6CCGAG AGGCCGCAGA ACATGCCAGG TGTCCTCACT 1800 

OVCCCCACCC TCCCCAATTC CAGCCCTTTG CCCTCTC®^ "60 

CACAGGGGCA GTGTCACCCT GGAAAATACA GTGCCCTGCC TTCTTAGGGG CATCAGCCCT 1920 

QAACGGTTGA GAGCGIGGAG CCCTCCAGGC CTTTGCTCTC CCCTCTAGGC ACACGCGCAC 1980 

TOCACCTCT GCCCCATCCC AACTGCACCA GCACTGCCTC CCCCAGC5GAT CCTCTCACAT 2040

CCCACACTGG TCTCTGCACC ACCCCTCTG6 TTCACACCGC ACCCTGCACT CACCCACAGC 2100 

AGCTCCATCC ACIGGGAAAA CTGGGGTTTG CATCACTCCA CTGCACA6TG TTAGTGGGAC 2160 

CTGGGGGCAA GTCCCTTGAC TTCTCTGAGC CTCAGTTTCC TTATGTGAAA GTTGCTGGAA 2220 

CCAAAATGGA GTCACTTATG CCAAACTCTA ATAAAATGGA GTCGGGGGGG CACATAGAAG 2280 

CCCTCACACA CACATGCCCG TAACAGGATT TATCACCAAG ACACGCCTGC ATGTAAGACC 2340 

A6ACACAGGG CGTATGGAAA AGCACGTCXTT CAAAGACTGT AGTATTCC3^G ATGAGCTGCA 2400 

GATGCTTACC TACCAOGGCC GTCTCCACCA GAAAACCATC GCCAACTCCT GCGATCAGCT 2460 

TGTGACTTAC AAAOCTTGTT TAAAAGCTGC TTACATGGAC TTCTGTCCTT TAAAACGTTC 2520 

CCCTTGGCTG TGGCCCTCTG TGTATGCCTG GGATCCTTCC AAGCACTCAT AGCCCAGATA 2580 
GGAATCCTCT GCTCCTCCCA AATAAATTCA TCTGTTC 






Seq ID NO: 111 Protein sequence: 
Protein Accession #: NP_000686

1 11 21 31 41 51 

jSnCDEPRSTNL FMKLDSVFIW KEPFGLVLII APWNYPLNLT LVLLVGTLPA OICWLKPSE 60 

ISQGTBKVLA EVLPQYLDQS CPAWLGGPQ ETGQLLEHKL DYIFFTGSPR VGKIVMTAAT 120 

KHLTPVTLEL GGKNPCYVDD NCDPffTVANR VAWFCYPNAG QTCVAPDYVL CSPEMQERLL 180 

PALQSTITRP YGDDPQSSPM LGRIINQKQP QRLRALLGCG RVAIGGQSNE SDRYIAPTVL 240 

VDVQSTEPVM QEEIPGPILP IVNVQSVDEA IKFINRQEKP LALYAPSNSR QWNQMLERT 300 

SSGSFGGNEG FTYISLLSVP FGGVGHSGMG RYHGKPTFDT PSHHRTCLIA PSGLEKLKEI 360 
RYPPYTDWNQ QLLRWGMGSQ SCTLL 



Seq ID NO: 112 DNA sequence
Nucleic Acid Accession #: NM_004456
Coding sequence: 58-2298 

1 11 21 31 41 51 

GAATTCCGGG C6ACGCGCGG GAACAACGCG AGTOGGCGCG CGGGACGAAG AATAATCATG 60 

GGCCAGACTG GGAAGAAATC TGAGAAGGGA CCAGTTTGTT GGCGGAAGCG TGTAAAATCA 120 

GAGTACATGC GACTGAGACA GCTCAAGAGG TTCAGACGAG CTGATGAAGT AAAGAGTATG 180 

TTTAGTTCCA ATCGTCAGAA AATTTTGGAA AGAACGGAAA TCTTAAACCA AGAATGGAAA 240 

CAGCGAAGGA TACAGCCTGT GCACATCCTG ACTTCTGTGA GCTCATTGGG OGGGACTAGG 300 

GAGTGTTCGG TGACCAGTGA CTTGGATTTT CCAACACAAG TCATCCCATT AAA GACTC TG 360 

AATGCAGTTG CTTCAGTACC CATAATGTAT TCTTGGTCTC CCCTACAGCA GAATTTTATG 420 

6TGGAAGATG AAACTGTTTT ACATAACATT CCTTATATGG GAGATGAAGT TTTAGATCAG 480 

GATGGTACTT TCATTGAAGA ACTAATAAAA AATTATGATG GGAAAGTACA CGGGGATAGA 540 

GAATGTGGGT TTATAAAT6A TGAAATTTTT GTGGAGTTGG TGAATGCCCT TGGTCAATAT 600 

AATGATGATG ACGATGATGA TGATGGAGAC GATCCTGAAG AAAGAGAAGA AAAGCAGAAA 660 

GATCTGGAGG ATCACCGAGA TGATAAAGAA AGCCGCCCAC CTCGGAAATT TCCTTCTGAT 720 

AAAATTTTGG AGGCCATTTC GTCAATGTTT CCAGATAAGG GCACAGCAGA AGAACTAAAG 780 

GAAAAATATA AAGAACTCAC CX3AACAGCAG CTCCCAGGCG CACTTCCTCC TGAATGTACC 840 

CCCAACATAG ATGGACCAAA TGCTAAATCT GTTCAGAGAG AGCAAAGCTT ACAC TCCTTT 900 

CATAOGCTTT TCTGTAGGCG ATGTTTTAAA TATQACTGCT TCCTACATCC TTTTCATGCA 960 

ACACCCAACA CTTATAAGCG GAAGAACACA GAAAC AGCT C TAGACAACAA ACCTTGTCGA 1020 

CCACAGTGTT ACCAGCATTT GGAGG6AGCA AAGGAGTTTG CXGCTGCTCT CACCGCT6AG 1080 

CGGATAAAGA CCCCACCAAA AOSTCCAGGA GGCOGCAGAA GAGGACGGCT TCCCAATAAC 1140 

' AGTAGCAGGC CCAGCACCCC CACCATTAAT GTGCTGGAAT CAAAGGATAC AGACAGTGAT 1200 

AGGGAAGCAG GGACTGAAAC GGGGGGAGAG AACAATGATA AAGAAGAAGA AGAGAAGAAA 1260 

GATGAAACTT CGAGCTCCTC TGAA6CAAAT TCTCGGTGTC AAACACCAAT AAAGATGAAG 1320 

CCAAATATTG AACCTCCTGA GAATGTGGAG TG6AGT0GT6 CTGAAGCCTC AATGTTTAGA 1380

GTCCTCATTO GCACTTACTA TGACAATTTC TGTGCCATTG CTAGGTTAAT TGGGACCAAA 1440 

ACATGTAGAC AGGTGTATGA GTTTAGAGTC AAAGAATCTA GCATCATAGC TCCAGCTCCC 1500 

GCTGAGGATG TGGATACTCC TCCAAGGAAA AAGAAGAGGA AACACCGGTT GTGGGCTGCA 1560

CACTGCAGAA A6ATACAGCT GAAAAAGGAC GGCTCCTCTA ACCATGTTTA CAACTATCAA 1620 




CCCTOTGKTC ATCCAlOQGCA GCCTTGTCaC AGTTOGTGCC CTTGTGTGAT ACCACAAAAT 1680 

rrTTGTGAAA AGTTTTGTCA ATGTAGTTCA GAGTGTCAAA ACOGCTTTCC GGGATGCOGC 1740 

TGCAAAGCAC AGTGCAACAC CAAGCAGTGC COSTGCTACC TGGCTGTCOG AGAGTGTtaC 1800 

CCXGAOCTCT GTCTTACTrG TGGAGCOGCT GACCATTGGG ACAGTAAAAA TGTGTCCTGC 1860 

AAGAACIGCA GTATTCAGOG GGGCTCCAAA AAGCATCTAT TGCTGGCACC ATCTGAOGTG 1920 

GCAG6CTGG6 GGATTTTTAT CAAAGATCCT GTGCAGAAAA ATGARTTCAT CTCAGAATAC 1980 

TCTCGAGAGA TTATTTCTCA AGATGAAGCT GACAGAAGflG GGAAAGTGTA TGAtAAATAC 2040 

ATGTGCAGCT TTCTGrTCaA CTTGAACAAT GATTTTGIGG TGGATGCAAC O OOO AGGGT 2100 

AACAAAATTC GTTTTGCAAA TCATTCGGTA AATOCAAACT GCTAT6CAAA AGTTATGATG 2160 

GTTAACGGTG ATCACAGGAT AGGTATTTTT GCCAAGAGAG CCATCCAGAC TGGCGAAGAG 2220 

CTGTTEGTTG ATTACAGATA CAGCCAGGCT GATGCCCTGA AGTATC7CGG CATCGAAAGA 2280 

GAAATGGAAA TCCCTTGACA TCTGCTACCT OCTCCCCCTC CTCTGAAACA GCTGCCTTAG 2340 

CTTCAQGAAC CTOSAGTACT GTGGGCRATT TAGAAAAAGA ACATGCAGTT T GAAATTC TG 2400 

AATTTGCAAA GTACTGtAAG AA2CAATTTAT AGTAATCAGT TTAAAAATCA ACTTTTTATT 2460 

GCCTTCrCAC CACCrcCAAA GTGrrTTGTA CCAGTGAATT TTTGCAATAA TGCAGTATGG 2520 
TACATTTTTC AACTTTGAAT AAAGAATACT TGAACTTGAA AAAAAAAAAA AAAAAA 



Seq ID NO: 113 Protein sequence:
Protein Accession #: NP_004447

1 11 21 31 41 51 

KGQTCKKSEK GPVCWRKRVK SBYMRLRQIiK RFRHADEVXS MFSSNRQKIL ERTEILKQEW 60 

KQRRIQPVHI LTSVSSLRGT RECSVTSDLD FPXQVIPLIcr LUAVASVPIM YSWSPI^QNF 120 

MVEDETVMN IPYMGDEVLD QDGTPIBBLI KNVDGKVHGD REOGPIMDBI FVBLWALGQ 180 

YNDDDTODDG DDPBBREEKQ KDItBDHRDDK BSRPPRKPPS DKILEAISSM PPDKDTAEBL 240 

KEKYKELTEQ QLPGALPPEC TPNIDGPNAK SVQREQSLHS FHTI*PCRRCF KXDCPLHPFK 300 

ATPNTYKRKN TETAU)NKPC GPQCYQHLBG AKEPAAALTA ERIKTPPfCRP GGRRRGRLPS 360 

MSSRPSTPTI NVLESKDTDS DBEA6TBTGG ENNDKEEEBK KDETSSSSEA NSRCQTPIKM 420 

KPSIEPPENV EWSGAEASMP RVLIGTYVDH FCAIARLIGT KTCROVYEFR VKBSSIIAPA 480 

PAEDVDTPPR KKKRKHRLWA AHCRKIQLKK DGSSNHVYNY QPCDHPRQPC DSSCPCVIAQ 540 

NPCEKFCQCS SECQNRFPGC RCKAQCNTKQ CPCYLAVREC DPDLCI.TCGA ADHHOSKUVS 600 

CKNCSIQRGS KKHLLLAPSD VAGWGIFIKD PVQKSEPISE YCX5BIISQDE ADRRGKVYDK 660 

YMCSFLFNLN NDPWDATRK GNKIRFANHS VNPNCYAKVM MVNGDHRIGI PAKRAIQTGE 720 
ELFVDYRYSQ ADALKYVGIB REMEIP 

Seq ID NO: 114 DNA sequence
Nucleic Acid Accession #: NM_001827
Coding sequence: 96-335

1 11 21 31 41 51

AGTCICOGGC GAGTTGTTGC CTGGGCTGGA CXSTGGTTTTG TCTGCTGCGC CCX5CTCTTCG 60 

aSCTCTCGTT TCATTTTCTG CAGOSCGCCA CGAGGAT6GC CCACAAGCAG ATCTAC TACT 120 

CGGACAAGTA CTTOSACGAA CACTACGAGT ACXX3GCATGT TATGTTACCC AGAGAACTTT 180 

CXAAACAAGT ACCTAAAACT CATCTGATGT CTGAAGAGGA GTGGAGOAGA CTTGGTGTCC 240 

AAC3«3AGTCT AGGCTGGGTT CATTACATGA TTCATGAGCC AGAACCACAT ATTCTTCTCT 300 

TTAGACX5ACC TCTTCCAAAA GATCAACAAA AATGAAGTTT ATCTGGGGAT CGTCAAATCT 360 

TTTTCAAATT TAATCTATAT GTGTATATAA GGTAGTATTC AGTGAATACT TGAGAAATGT 420 

ACAAATCITT CATCCATACC TGTGCATGAG CTGTATTCTT CACAGCAACA GAGCTCAGTT 480 

AAATGCAACT GCAAGTAGGT TACTGTAAGA TGTTTAAGAT AAAAGTTCTT CCAGT CAGTT 540 

TTTCTCTTAA GTGCCTGTTT GAGTTTACTG AAACAGTTTA CTTTTGrTCA ATAAAGTTTG 600 
TATGTTGCAT TTAAAAAAAA AAAAAAA 



Seq ID NO: 115 Protein sequence:
Protein Accession #: NP_001818

1 11 21 31 41 51

| | | | | |

MAHKQIYYSD KYFDEHYEYR HVMLPRELSK QVPKTHLMSE EEWRRL6VQQ SLGHVKYMIR 
EPEPHILLPR RPIjPKDQQK 



Seq ID NO: 116 DNA sequence 

Nucleic Acid Accession #: CAT cluster



1 11 21 31 41 51 

| | | | | |

TCAGACCTCA TGAGTCACTT GGACTCTTGA GCCACCTCTG GGGGTGGAGT CTCTCTCCTG 60 

GCATCTGGAC CCTTGGTGCT ATCGACGAAG CTTGGGTGGG GCTCTTAGCT GCTATGTGCA 120 

AGAGGT6TGT TCCAGGGAAA GCCCCTATCT CTCTGCAGAG GTCAAGTGAA AGOGAOGGCC 180 

GCAGCCAACA GAGTTCAAAA TGCAGGCTTG GAAAGTACAG GGGGCTCTGT GGAGGATGGG 240 

AAGGACTGAT CCACATTOCC ACCAGGAAGT TTAGCAGAAC CC0C3GCGTGC CAACTGGACX: 300 

CCTTGGAAGG ACCTGGCTCA GGCTGGACCA CCTCTTGAGA GGGAGGAGCT CTGGATTTGA 360 

TCAAGAATTC TTTGCTGAGC ATGGTGCCTC ATCCCTATAA TACCAACACT TTGGGAGGCC 420 

AGTGTGGGAG GATCTCTTGA GCCCAGGAGT TCAAGACTAG CCTGGGCAAC ACAGAGAGAA 480 

CCCATCTCTA AAATAATAAT AATAATAAAA TAAAAAATTA GCAGGGCATG GTGGCATGTG 540 

CCTGIAGTTC CAGCTAGCCA GGAGGCTGAG GCAAGAGGAT GGCIGGA6GC TGGGATGTTG 600 

AGGCT6CAAT GAACTGTGAT TACCCCACTG CACTCCA6CC TGGGCAAAA6 AGOQAGAGAA 660 

CCTGTCTCAA ATAATAATAA TAATAATAAT CTTATTTTGG AGAATAAAGA GACCTCTGGA 720 

TTTGAGGTGC CATTTGGGTA GAAAGAAAAG ACGTTTACAC CGAGAAATAG TCTGTGTTGC 780 

OCTGAAGGAG CAGAGGGATG CATOGCTGGA GGTGACCTAC AGTTGAAGAA GACTCATTAT 840 

GACAGACCTT GTCCTTCTTC CTTGTGGAAA GTGTTTCCTC TGCTGCTACT GCTCATGAGA 900 

CTCTTCCCCC TCCXrrGTCCC AGGGAACCAA AGGGCTTTCT AC CACAC CCT TTCTTGCXXrC 960 

COGCCTCCCA TGTCTGCTGT GCCTTTGTAC TCAGCAATTC TTGTTTGCTC CAT TATCTTC 1020 

CAGOXSGATA CAGAGTGAAT AGTTAACCAC ACTTAGCSTCA AATAG6ATCT AAATTTTTGT 1080 

TCCTGCTCCG T6TAAACAGG CCAGTGTTTG TGTGTTGCAA GCAGCXTTTOS AATAGTAACT 1140 
CrrCTCATTP GTTTGOGATC TGGCCACCAA GTrCCAGAAT GATACAOGGA TCAGTGCACA 1200

AGTTCATCAG GCTCTOGGAC CTTAQCSGCTG TrCGAGAAGG CTTCACCftGC AGAACTGATO 1260 

GTGAAGGCTC GTGTTCTCCA TOCTCAACTT TCTITGCTTC GATCATACAC AAGAATACAT 1320 

ITCGAAGGGC AAAAAATGAA CACTGTCGTT CATTGCAGCC GTCTTTTGTG ACaCaCATGC 1380 

ACSVGTCTGCT GTGAAGACCT TCTCTCAAGT GGCATTTGGG AGTCCATGCC AGATCATGGT 1440 

GCTTCAtGAG AGACTGACAG CTATCAGGGG TTCTGGCACT TAGTGAGGAC TCTCCTCCCC 1500 

CAGTGTGTCC TGATGACACA TACACACCTG ACAATAGCTT GAGTCTTCTC TGTTCCTTTT 1560 

ACTCTGTAGC CAACATACAC ATGATTTAAA ACOCmCTA AATATCTATC ATGGTTCATC 1620 

CTTGTCX»AA TGCAGAGTCA GAGCTATTTG TACTTCATTA TTATTTCCAA GGCGAATAGT 1680 

• i miC WC i'l' ntGCAAAAA TAATTAAAGT TTTTGTATGT TGCAAAAAAA AAAAAAAAAA 1740 
AAACAAAAAA 

Seq ID NO: 117 DNA sequence

Nucleic Acid Accession #: BC012178.1

Coding sequence: 204-2285

1 11 21 31 41 51

CTTCTCrCCC GCGGCGCTGG GGCCCGOSCT CCGCTGCTGT TGCTCCATTC GGOGCTTTTC 60 

TGGOGGCTGG CTCCTCTCCG CTGCCGGCTG CTCCTCGACX: AGGCCTCCTT CTCAACCTCA 120 

GCCOGCGGCG OCGACXXTTTC CQGCACCCTC (XOCCOCGTC TCGTACTCTC GCCGTCACCG 180 

CCGOGGCTCC GGCCCTGGCC COGATGGCTC TGTGCAACGG AGACTCCAAG CTGGAGAATG 240 

CTGGAGGAGA CCTTAAGGAT GGCCACCACC ACTATGAAGG AGCT GTTGTC ATTCTGGATG 300 

CTGGTCCTCA GTAOGGGAAA GTCATAGACC GAAGAGTGAG GGAACTGTTC GTGCAGTCT6 360 

AAATTTTCCC CTTOGAAACA CCAGCATTTG CTATAAAGGA ACAAGGATTC OGTGCTATTA 420 

TCATCTCTGG AQGACCTAAT TCTGTGTATG CTGAAGATGC TCCCTGGTTT GATCCAGCAA 480 

TATTCACTAT TGGCAAGCCT GTTCTTGGAA TTTGCTATGG TATGCAGATG ATGAATAAGG 540 

TATTTCGAGG TACTCTGCAC AAAAAAAGTG TCAGAGAAGA TGGAGTTTTC AACATTAGTG 600 

TGGATAATAC ATOTTCATTA TTCAGGGGCC TTCAGAAGGA AGAAGTTGTT TTGCTTACAC 660 

ATGGAGATAG TGTAGACAAA GTAGCTGATG GATTCAAGGT TGTGGCACGT TCTGGAAACA 720 

TAGTAGCAGG CATAGCAAAT GAATCTAAAA AGTTATATGG AGCACAGTTC CACCCTGAAG 780 

TTGGCCTTAC AGAAAATGGA AAAGTAATAC TGAAGAATTT CCTTTATGAT ATAGCTGGAT 840 

GCAGTGGAAC CTTCACCXyiG CAGAACAGAG AACTTGAGTG TATTCGAGAG ATCAAAGAGA 900 

GAGTAGGCAC GTCAAAAGTT TTGGTrTTAC TCAGTGGTGG AGTAGACICA ACAGTTTGTA 960 

CAGCTTTGCT AAATCGTGCT TTGAACCAAG AACAAGTCAT TGCTGTGCaC ATTGATAATG 1020 

GCTTTATGAG AAAACGAGAA AGCCAGTCTG TTGAAGAGGC CCTCAAAAAG CTTGGAATTC 1080 

AGGTCAAAGT GATAAATGCT GCTCATTCTT TCTACAATGG AACAACAACC CTACCAATAT 1140 

CAGATGAAGA TAGAACCCCA CGGAAAAGAA TTAGCAAAAC GTTAAATATG ACCACAAGTC 1200 

CTGAAGAGAA AAGAAAAATC ATTGGGGATA CTTTTGTTAA GATTGCCAAT GAAGTAATTG 1260 

GAGAAATGAA CTTGAAACCA GAGGAGGTTT TCCTTGCCCA AGGTACTTTA OGGCCTGATC 1320 

TAATTGAAAG TGCATCCCTT GTTGCAAGTG GCAAAGCTGA ACTCATCAAA ACCCATCACA 1380 

ATGACACAGA GCTCATCAGA AAGTTGAGAG AQGAGGGAAA AGTAATAGAA CCTCTGAAAG 1440 

ATTTTCATAA AGATGAAGTG AGAATTTTGG GCAGAGAACT TGGACTTCCA GAAGAGTTAG 1500

TTTCCAGGCA TCCATTTCCA GGTCCTGGCC TGGCAATCAG AGTAATATGT GCTGAAGAAC 1560 

CTTATATTTG TAAGGACTTT CCTGAAACCA ACAATATTTT GAAAATAGTA GCTGATTTTT 1620 

CTGCAAGTGT TAAAAAGCCA CATACCCTAT TACAGAGAGT CAAAGCCTGC ACAACAGAAG 1680 

AGGATCAGGA GAAGCTGATC CAAATXACCA GTCTGCATTC ACTGAATGCC TTCTTGCTGC 1740 

CAATTAAAAC TGTAGGTGTG CAGGGTGACT GTCGTTCCTA CAGTTACGTG TGTGGAATCT 1800 

CCAGTAAAGA TGAACCTGAC TGGGAATCAC TTATTTTTCT GGCTAGGCTT ATACCTOGCA 1860 

TGTGTCACAA CXTTTAACAGA GTTGTTTATA TATTTGGCCC ACCAGTTAAA 6AACCTCCTA 1920 

CAGATGTTAC TCCCACTTTC TTGACAACAG GGGTGCTCAG TACTTTACGC CAAGCTGATT 1980 

TTGAGGCOCA TAACATTCTC AGGGAGTCTG GGTATGCTGG GAAAATCAGC CAGATGCCGG 2040 

TGATTTTGAC ACCATTACAT TTTGATCGGG ACCCACTTCA AAAGCAGCCT TCATGCCAGA 2100 

GATCTGTGGT TATTCGAACC TTTATTACTA GTGACTTCAT GACTGGTATA CCTGCAACAC 2160 

CTGGCAATGA GATCCCTGTA GAGGTGGTAT TAAAGATGGT CACTGAGATT AAGAAGATTC 2220 

CTGGTATTTC TCGAATTATG TAT6ACTTAA CATCAAAGCC CCCAGGAACT ACTGAGTGGG 2280 
AGTAATAAAC TTCTTGTTCT ATTAAAA 



Seq ID NO: 118 Protein sequence: 
Protein Accession #: AAH12178.1 

1 11 21 31 41 51

MAiiOraDSKL Laggdlxdg HHHYEGAWI LDAGAQYGKV IDRRVRELFV QSEIFPLETP 60 

APAIKEQGPR AIIISGGPNS VYAEDAPWFD PAIFTIGKPV LGICYGKQMM NKVFGGTVHK 120 

KSVREDGVFN rSVDNTCSLP RGLQKEEWL LTHGDSVDKV ADGFKWARS GNIVAGIANB 180 

SKKLYGAQFH PEVGLTENGK VILKNFLYDI AGCSGTFTVQ NRBI*BCIRBI KERVGTSKVL 240 

VLLSGGVDST VCTALLNRAL NQEQVIAVHI DNGFMRKRES QSVEEALKKL GIQVKVINAA 300 

HSFYNGTTTL PISDEDRTPR KRISKTUIMT TSPEEKRKII GDTPVKIANE VIGEMNLKPE 360 

EVFLAC3GTIJI PDLIESASLV ASGKAEWKT HHNDTELIRK LREEGKVIEP LKDFHKDEVR 420 

ILGRELGIiPE ELVSRHPPPG PGLAIRVICA EEPYICKDFP ETNNILKIVA DFSASVKKPH 480 

TLLQRVKACr TEEDQEKLKQ ITSLHSMIAF LLPIKTVGVQ GDCRSYSYVC GISSKDEPDM 540 

ESIiIFLARLI PRMCHNVHRV VYIFGPPVKE PPTDVTPTPL TTGVLSTLRQ ADPEAHNIUl 600 

ESGYAGKISQ MPVILTPLHF DRDPLQKQPS OQRSWIRTF ITSDFMTGIP ATPGHEIPVE 660 
WIiKMVTEIK KIPGISRIMY DLTSKPPGTT EWE 

Seq ID NO: 119 DNA sequence

Nucleic Acid Accession #: NM_006500.1

Coding sequence: 27..1967

1 11 21 31 41 51

| | | | | |

ACTTCCGTCT CXXrCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 60 
TOGCCGCCTG CTGCTGCTGT CCTCGOGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG 120 
06CCTCAGCT GG7GGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTOCC 180 
AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT OCACAAGGAG AAG0GGAC3GC 240 
TCATCrrCCG TCTGCGCCAG GGCCAGQGCC AGAGOSAACC TGGGGAGTAC GAGCACOGGC 300

TCAGCCTCCA GGACAGW3GG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAWS^OGAGC 360 
GCftT Cri t.i t GTGOCftGGGC AAGCGOCCTC GQTCCCAGGA GTACOGCATC CAGCTCCGCG 420 

TCTACAAAfiC TCOGGAOGAG AGGTCAACCC CCTGGGCATC CCTCTGAACA 480 

GTAAGGAGCC TGAGGAflGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTOCTCAAG 540 

TCATCTGGTA CAAGAATGGC OGGCCTCTGA AGGAGGAGAA GAACCXX3GTC CACATTCAGT 600 
CGTCCCAGAC TGTGGACTCG AGTGGTTTGT ACACCTTGCA GABTATTCTG AAGGCACAGC 660 
TCGTTAAAGA AGACAAftGAT GCOCAGTTTT ACTGTGAGCT CaUWrtACOGG CTGCCCAGTG 720 
GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 780 
TGTGGCTGGA AGTCGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840 
GTTrcGCTCA TGGCAACCCT CCACCACACT TCAGCATCAG CAAfiCAGAAC CCCAGCACXA 900 
CGGAGGCACA GGAAGAGACA ACCAACGACA ACGGGGTCCT GCJTGCTGGAG CCTGCCOGGA 960 

AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATOCCTGC 1020 

TGAGTQAACC ACAGGAACTA CTGGTGAACT ATCTGTCTGA CGTCCGAGTG AGTCCCGCAG 1080 

CCCCTGAGAG ACAGGAAGGC ACCAGCCTCA CCCTGACCTG TGAGC3CAGAG AGTAGCCAGG 1140 

ACCTCGAGTT CCAGTGGCTG AGAGAftGAGA CAGACCAGGT GCIGGAAAGG GGGCCTGTGC 1200 

TTCAGTTGCA TGACCTOAAA CGGGAfiGCAG GAGGOGGCTA TCGCT GCGTG GOGTCTOTGC 1260 

CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCXATTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACCA AGATCXaCAG OGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCOGGAGC 1500 

TOTTGGAGAC AGGTGTrGAA TGCAOGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCTCGA GCTGGTCAAT TTAAOCACCC TCACACCAGA CTCCAACACA AOGACTGGCC 1620 

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTOCACA GAGAGAAAGC 1680 

TGCOGGAGCC GGAGAGOXG GGOGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC 1740

TGGOGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG COGTGCAGGC 1800 

GCTCAGGGAA GCAGGAGATC A(XXrrGCCCC CXTTCTCXSTAA GACCGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAQAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCCGGG AGACCAfiGGA GAGAAATACA TOGATCTGAG GCATTAGCCC CQAATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCXTTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160 

GTCCACCACC ATCTCCTCCA OGTTGAGTGA AGCTCATCCC AAGCyVAGGAG CCCCAGTCTC 2220 

CCGAGOGQGT AQGAGAGTTT CTTGC»GAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCIGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTOG TTTCCTGCCC 2340 

CAAAGGCTOG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AG6ACACACC GGAGCCAGGC 2400 

GCCTCCTCAT GTTGAAGTGC GCTCTTCACA CCCXSCTCOSG AGAGCACCCC AGCQGCATCC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 2640 

TCACAAACTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG GOGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCG6AAGG 2760 

CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC 2820 

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAQGAAAAA AAAAGAAAAG 2880 

ACGCGTACCT GOSGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940 

TOCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC AOGTAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAAOGGGGC CTGGCTAGAG CTTCGCGTGT G TOTGT CTGT 3180 

CTGTGTGTAT GCATACATAT GTGTCTATAT ATGGTTTTGT GAGGIGTGTA AATTT6CAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TCTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGGGGG CCTGTCAAAC TACAACCAAA AGGCACACAA AACCGTTTCX: AGTTGGCAGC 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCAOCSAAGGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTT 



Seq ID NO: 120 Protein sequence: 
Protein Accession #: NP_006491.1



1 11 21 31 41 51 

ilGLPRLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVEV GSTALLKCGL SQSQGNLSHV 60 

DWFSVHKBKR TLIFRVRQGQ GQSEPGEYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR 120 

PRSQBYRIQL RVYKAPBEPN IQVMPLGIPV NSKBPEEVAT CVGHNGYPIP QVIWYKNGRP 180 

LKEBKNRVHI QSSQTVBSSG LYTLQSILKA QX.VKEDXDAQ FVCEUIYllIiP SGMHMKESRE 240 

VTVPVFYPTE KVWLEVSPVG MLKEGDRVEI RCLADGMPPP HPSISKQNPS TREAEBBTTN 300 

ONGVLVLEPA RKEHSGRYEC QAWNLDTMIS UiSEPQELLV NYVSDVRVSP AAPERQEGSS 360 

LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRE AGGGYRCVAS VPSIPGLNRT 420 

QLVKLAIFGP PWMAFRERKV WVKENMVLNL SCEASGHPRP TISWNVNGTA SEQDQOPQRV 480 

IiSTUmjVTP BLLBTGVECT ASNDLGKNTS ILFLEIiVNLT TLTPDSNTTT GLSTSTASPH 540 

TRAHSTSTER KLPBPBSRGV VIVAVIVCIL VIAVLGAVLY FLYKKGKLPC RRSGKQEITL 600 
PPSRKTBIiW EVKSOKLPEE MGLXiQGSSGD KRAPGDQGEK YIDLRH 

Seq ID NO: 121 DNA sequence
Nucleic Acid Accession #: NM_018306
Coding sequence: 60-671 

1 11 21 31 41 51 

ATAGTCTACA CAGAGCTCCC CTTGCTGCCC AGACAAGCTG AAGGACCACA GGAAAAGCCA 60 

TGGAGACTTC AGCATCCTCC TCCCAGCCTC AGGACAACAG TCAAGTCCAC AGA GAAACAG 120 

AAGATGTAGA CTATGGAGAG ACAGATTTCC ACAAGCAAGA CGGGAAGGCT GGACTCTTTT 180

CCCAAGAACA ATATGAGAGA AACAAGTCTT CTTCCTCCTC CTTCTCTTOC TC CTCATCCT 240 

OCTCATCTTC TTCATCCTCC TCCTCCTCAG GTOCTGQGCA TGGGGAGCCT GAOGTTTTGA 300 
AGGATCAGCT TWCTCIAT GGAGATGCTC CTGGAGAGGT GGTACCCTCT GGGGAATCAG 360

GACTCOGAAG GAGAGGCTCT GACCCAGCAA GTGGAGAAGT GGAGGCCTCT CAGTT AAGAA 420 

GACTOAATAT AAAGAAAGAT GATGAGmT TCCATTtCGT CCTCCTGTGC TTTTCCWCG 480 

GGGCCTTGCT GGTX3TGTTAT CACTATTACG CAGACTGGTT CATGTCrCTT GGGGTCGGOC 540

TGCTCACCTT CGOCTCCCTG GAAACCGTTG GCATCTACTT OSGACTAGTG TACOGTATCC 600 

ACAGOCSTCCT CCAAGGCTTC ATCCXrCTCT TGCAGAAGTT TAGGCTGACA GGGTTCftGGA 660 

AGACTCACTG AGGCCACTTC CAGGTCGGCA GCAGAGGCAG GCCCCAGTGT GACCACCACT 720 

GOGACCXXnx; AGCCCACAAG GGCftCAGCftC CATTCrowa GACGCACAGG W»COjAGCC 780

AGACCAATAA ACAGAACACT TTTCCTTa^ TGTGGTCTGA ATGTTGGCAC CAGCCOCWGC 840 

AGGGGCATCT CATTTCGGCA GTACTCCTGT GCAACCCAGC TGC3UW3GATG GAAGGCAOJG 900 

GGTGGGTCTC GGGCCTGAGG CTTCACAGTA CCTGGACCAG CAQGAAGATT CTGOaGGTC 960 

ACTGCTCTCA GAGGACAGCA AGGGACCCTG AGCTCTGCAA GCTGTGATCT GTCTGGGTTC 1020 

ATGGTTTTTC TCAAATCCCA GGCTATCTGC ATGOGCTCTC AGGTGCTACC GAGCCATCCT 1080 

GATOGTCCAC TGCTTTGftCG CACGGAGCCA TCGGGCTGGG GCCCCTTGGT 1140 

GAACCTGATG CAGGTAAGAT GCTGAGGACT AAAACCATTT rTTTTGCACC CAAAAflAAAA 1200 

GGCAGGAAAA TGATCATCAG AAACTAAATG GCAGCCAGGC ATGGGGGCTC ACGACTGTAA 1260 

TCCTCGCACT TTGGGAGGCT CAGGCTAAGG GTCGCTTGAA GCTCAGACTT CAAGACCT^ 1320 

CTGGGCAACA TAGTGAGACC CCCATCTCTA CAATTTTTTT TTAATGACCA AATGTGGCGG 1380 

TACATACCTG TACATACCTG CGGTTCCAGC TACTCAAGAG GCTGAGGCAG GAGGACTGCT 1440 

TGAGCCCAGG AGTTCAGGGC TGCAGTGAGG TAC3GATCAAG CCACTGCACT CCAGCCTGGG 1500 
OGAGAGAGCA AGATOGTTTC TCTAAAATT 



Seq ID NO: 122 Protein sequence:
Protein Accession #: NP_060776

1 11 21 31 41 51 

METSASSSQP QDNSOVHRET EDVDYGETDP HKQD6KAGLF SQBQYERMKS SSSSPSSSSS 60
SSSSSSSSSS GPGHGEPDVL KDBLQLYGDA PGEWPSGBS GLRRRGSDPA SGEVEASQLR 120
aUIIKKDDBF FHFVLLCFAI GALLVCYHYY ADHFMSU3VG LLTPASLBTV GlYFCLVYRI 180
aSVLQGFIPL FQKFRliTGFR RTD



Seq ID NO: 123 DNA sequence 
Nucleic Acid Accession #: BC022542
Coding sequence: 243..896

1 11 21 31 41 51

ACTTGGTCCC AGCCGATAAA TCTGGGGCAG CGCGCGGTAG GAGCTGCGGG CGGCCAGGCC 
CCTTCCTGCG TCCXSCACCTG GCCCX»OGCG CXXCTCTCGG GCGTCOSGCT TOOGGOGTCC 
TGGCGGCTCG GGTGGCGGCG GTTCGGGOGG CCGCCTGGCT GCTCCTOGGG GCXSGOSACGG 
GGCTCACGCG CGGGCCCGCC ACGGCCTTCA CCGCCGCGCG CTCTGACGCC GGCATAAGGG 
CCATGTOTPC TGAAATTATT TTGAGGCAAG AAGTTTTGAA AGATGGTTTC CACAGAGACC 
TTTTAATCAA AGTGAAGTTT GGGGAAAGCA TTGAGGACTT GCACACGTGC CGTCTCTTAA 
TTAAACAGGA CATTCCTGCA GGACTTTATG TGGATCCGTA TGAGTTGGCT TCATTACGAG 
AGAGAAACAT AACAGAGGCA GTGATGGTTT CAGAAAATTT TGATATAGAG GCCCCTAACT 
ATTTGTCCAA GGAGTCTGAA GTTCTCATTT ATGCCAGACG AGATTCACAG TGCATTGACT 
GTTTTCAAGC CTTTTTGCCT GTGCACTX3CC GCTATCATCG GCCGCACAGT GAAGATGGAG 
AAGCCTCGAT TGTGGTCAAT AACCCAGATT TGTTGATGTT TTGTGACCAA GAGTTCCOGA 
TTTTGAAATG CTGGGCTCAC TCAGAAGTGG CAGCCCCTTG TGCTTTGGAT AATGAGGATA 
TATGCCAATG GAACAflGATG AAGTATAAAT CAGTATATAA GAATGTGATT CTACAAGTTC 
CAGTGGGACT GACTGTACAT ACCTCTCTAG TATGTTCTGT GAC TCTGCT C ATTACAATCC 
TGTGCrCTAC ATTGATCCTT GTAGCAGTTT TCAAATATGG CCATTTTTCC CTATAAGTTT 
TATGTAGTTA AATGCTTCCT AGAAACCTAA ATAAGATCTA TTAATTTCTG ACGAGAOGTG 
TTCTTCTAGA ATTAATTACT TTTATCTTTT GTCTTCATTT GTGGCCAAAA TTATGTTTAC 
TAGAGGAAAT TTGGGATCAT TCTCAGCTAA TTCCAAAATG TAGTGCTCTA TTGCATGGAT 
CCTTCGTAAT CCTCAAGCAT CAGATGCCAT AAGGGGAAAC TTAATTCTGC TAAATTAATG 
TTTArrrTGT GAGAAGTGAC TTTATCTTCA TTTGGGGTAG AAAAATTATT TCTTTATGTA 
GTAGAGACSWl ATCATTCTCA TTTTGCAAGT ACTTTCAATT TAAGCTACAA ATTGAGAAAA 
OQGTTATAAA TAAGAATAAA ATAGGOCAGG CACAGTGGCT CACACCTGTA ATCCCAGCAC 
TTTGGGAGGC CGAGGTGGGC GGATCACCAG AGGTCAAGAG TTTGAGACCA GCTTGGTGAA 
ACCCTGTCTC TACTAAAAAT ACAAAAGTTA GCTGGGGCTG GTGGTGGGCA TCTGTAGTCC 
CAGCTAATTC GAAGGGTGAG GCGGGAGGAT CX3CTTGAACC TGGGAGGCGG AGGTTCCAGA 
GAGCCAAOAT CGCACCACT6 CACTACAGCC TGGGCGACAG AACGAGACCC TGTCTCCAAA 
GGAAAAACAA AAAAGAAGAA TAAAATAATT TG6ATGAAAA TCATGTTTAT TTAAATAGTA 
ATGTCATGAG ACTATTAAAG ATGTGCCAGA GTTTCAATGA AAATCATTAA AGTAGGACAG 
CTAAGAAATT AATATTAATA TAAAAATTAT TGATAATCTT AAATTATTGA TTATTOCTTA 
ACGCACTCCA TTCTCCTTTT ACATTTTATC ATGTTTCTTT TGAATATATG AATTGGCAAA 
GGACTTGATG AAACTGAGTA CTAAGATTTG GTACA6AGTA TGTCAGGAAG ACAACTCAGA 
TTGCCATTTT AAATAAAGTT GTACATGAAC AAAAAAAAAA AAAAAA 

Seq ID NO: 124 Protein sequence:
Protein Accession #: AAH22542

1 11 21 31 41 51 

KCSBIILRQE VLKDGFHRDL LIKVKFGESI EDLHTCRLLI KQDIPAGLYV DPVELASLRE 60 

RKITEAVMVS ENFDIEAPNY LSKESEVLIY ARRDSQCIDC FQAPLPVHCR YHRPHSEDGB 120 

ASIWNNPDL LMFCDQAGSR RMIRFRFDSF DKTIEFPXLK CWAHSEVAAP CALENEDIOQ 180 
WNKMKYKSVY KNVILQVPVG LTVHTSLVCS VTI.LITILCS KXXKX 

Seq ID NO: 125 DNA sequence

Nucleic Acid Accession #: NM_004994.1

Coding sequence: 20..2143

1 11 21 31 41 51

| | | | | |

AGACAOCTCT GCCCTCAOCA TGAGGCTCT6 GCAGGCXXTEG GTCCTGGTGC TCCTTOTGCT 60 

GGGCTGCTGC TTTGCTGCCC CCAGACAGOG CCAGTCC A OC CiltSTGCTCT TCCCTGGAGA 120 

CCTGAJGAACC AATCTCACCG ACAOSCAGCT GGCaCAfiGAA TACCTGTACC GCTAJrGGTTA 180 

CACTOSGCTG GCAGAGATGC GTGGAGAGTC GAAATCTCTG GGGCCTGCGC TGCTGCTTCT 240 

CCAGAAGGAA CTGTCCCrGC COGAGACCGG TGAGCTC3GAT AGCGCCAOGC TGAAGGCCAT 300 

GOGAAGCCCA OGGTGCGGGG TCCCAGACCT GGGCAGATTC CAAACCTTTG AGGGCGACCT 360 

CAAGTOGOVC CACCA^QKAOV TCACCTATTG GATCCAMVAC TACTOGGAAG ACTTGCOGCG 420 

GGCGGTCATT GACGACGCCT TTGCCOGCGC CTTOGCACTG TGGAG OGOGG TGAOGCOGCT 480 

CACCTTCACT CGOGTGTACA GCOGGGAOGC AGACRTCXSTC ATCOVGrtTG GTGTCGCGGA 540 

GCACGGAGAC GGGTATCCCT TCGACGGGAA GGAC3GGGCTC CTGGCACACG CCTTTCCTCC 600 

TGGCCCCGGC ATTCAGGGAG ACGCCCATTT OGAOGATGAC GAGTTGTQGT CCCTGGGCAA 660 

GGGCGTCGTG GTTCCAACTC GGTTTGGAAA CGCAGATGGC GCGGCCTGCC ACTTCCCCTT 720 

CATCTTOGAG GGCCGCTCCT ACTCTGCCTG CACCACCGAC GGTOGCTCOG ACGGCTTGCC 780

CTGGTGCAOT ACCAOGGCCA ACTACGACAC CGAOGACOGG TTTGG CTTC T GCCCCAGOGA 840 

GAGACTCTAC ACC0GG6A0G GCAATGCTGA TGGGAAACCC TGCCAGTTTC CATTCATCTT 900 

CCAAGGCCAA TCCTACTCCG CCTGCACCAC GGAOGGTOGC TCOSACGGCT ACOGCTGGTG 960 

CGCCACCACC GCCAACTACG ACOGGGACAA GCTCTTCGGC TTCTGCCOGA CCCGAGCTGA 1020 

CTCGACGGTG ATGGGGGGCA ACTCGGCGGG GGAGCTGTGC GTCTTCCCCT TCACTTTCCT 1080 

GGGTAAGGAG TACTCGACCT GTACCAGCGA GGGCCGCGGA QATCGGCGCC TCTGGTGCGC 1140 

TACCACCTOG AACTTTGACA GCGACAAGAA GTGGGGCTTC TGCCCGGACC AAGGATACAG 1200 

TTTGTTCCTC GTGGOGGCGC ATGAGTTCGG CCAOGCGCTG GGCTTAGATC ATTOCTCAGT 1260 

GCCX3GAGGCG CTCATGTACC CTATGTACOG CTTCACTGAG GGGCCCCCCT TGCATAAGGA 1320 

CGACGTGAAT GGCATCCGGC ACCTCTATGG TCCTOGCCCT GAACCTGAGC CACGGCCTCC 1380 

AACCACCACC ACACCGCAGC CCACGGCTCC CCOGAOGGTC TGCCCCACCG GACCCCCCAC 1440 

TGTCCACCCC TCAGAGCGCC CCACAGCTGG CCCCACAGGT CCCCCCTCAG CTGGCCCCAC 1500

AGGTOCCCCC ACTGCTGGCC CTTCTACGGC CACTACTGTG OCTTTGAGTC CGGTGGAOGA 1560 

TGOCTGCAAC GTGAACATCT TCGA06CCAT CGOGGAGATT GGGAACCftGC TGTA ITTGTT 1620 

CAAGGATGGG AAGTACTQGC GATTCTCTGA GGGCAGGGGG AGCOGGCCGC AGGQCOCCTT 1680 

CXTTTATCGCC GACAAGTGGC COGCGCTGCC COGCAAGCTQ GACTCGGTCT TTGAGGAGCC 1740 

GCTCTCCAAG AAGCTTTTCT TCTTCTCTGG GCGCCaGGTG TGGGTGTACA C3U3GCGCXyrC 1800 

GGTGCTGGGC COGAGGOGTC TGGACAAGCT GGGCCTGGGA GCCGACGTGG CCCAGGTGAC 1860

OGGGGCCCTC OGGAGTGGCA GGGGGAAGAT GCTGCTGTTC AGCGGGCGGC GCCTCTGGAG 1920 

GTTOGACGTG AAGGC6CAGA TGGTGGATCC C0GGAGC3GCC AGCX3AGGTGG ACCGGATGTT 1980 

CCCCGGGGTG CCTTTGGACA CGCACGACGT CTTCCAGTAC OGAGAGAAAG CCTATTTCTG 2040 

CCAGGACCGC TrCTACrGGC GCGTGAGTTC CCGGAGTGAG TTGAACCAGG TGGAC CAAGT 2100 

GGGCTACGTG ACCTATGACA TCCTGCAGTG CCCTGAGGAC TAGGGCTCCC GTCCTGCTTT 2160 

GCAGTGCCAT GTAAATCCXX ACTGGGACCA ACCCTGGGGA AGGAGCCAGT TTGCCGGATA 2220 

CAAACTGGTA TTCTGTTCTG GAGGAAAGGG AGGAGTGGAG GTGGGCTGGG CCCT CTCTTC 2280 
TCACCTTTCT TTTTTGTTGG AGTGTTTCTA ATAAACTTGG ATTCTCTAAC CTTT 



Seq ID NO: 126 Protein sequence:
Protein Accession #: NP_004985.1

1 11 21 31 41 51

| | | | | |

MSIiWQPLVLV LLVLGCCFAA PRQRQSTLVL PPGOLRTNLT DRQLAEEYLY RYGYTRVAEM 60 

RGESKSLGPA LLLLQKQLSL PETGELDSAT LKAMRTPRCX5 VPDLGRFQTP EGDIjKWHHHN 120 

ITYWIQNYSE DLPRAVIDDA FARAFAIiHSA VTPLTFTRVY SItSADIVZQF GVAEB GDGY P 180 

PDGKDGIiLAH AFPPGPGIQG DAHFDDDELW SLGKGVWPT REX3NADGAAC HFPPIPEGRS 240 

YSACTTDGRS DGLPWCSTTA NYDTDDRFGP CPSERLYTRD GNADGKPCQF PFIFQ6QSYS 300 

ACTTDGHSDG YRWCATTANY DRDKLFGFCP TRADSTVKGG NSAGELCVPP PTFLGKEYST 360 

CTSEGRGDGR LWCATTSNFD SDKKWGFCPD QGYSLPLVAA HEFGHALGLD HSSVPEALMY 420 

PMYRFTBGPP LHKDDWGIR KLYGPRPEPE PRPPTTTTPQ PTAPPTVCPT GPPTVHPSBR 480 

PTAGPTGPPS AGPTGPPTAG PSTATTVPLS PVDDACNVNI FDAXASI07Q I.YI*PKDGKYW 540 

RPSEGRGSRP QGPFLIADKW PALPRKLDSV FEEPLSKKLF FFSGRQVWVY TGASVLGPRR 600 

LDKLGLGADV AQVTGALRSG RGKMLLFSGR RLWRFDVKAQ MVDPRSASEV DRMFPGVPIiD 660 
THDVFQYREK AYFCQDRFYW RVSSRSELNQ VDQVGYVTYD ILQCPED 



Seq ID NO: 127 DNA sequence
Nucleic Acid Accession #: NM_004181
Coding sequence: 32-670

1 11 21 31 41 51

| | | | | |

GCAGAAATAG CCTAGGGAGA TCAACCCCGA GATGCTGAAC AAAGTGCTGT CCCGGCTGGG 60 

GGTCGCCGGC CAGTGGGGCT TGGTGGAC6T GCTGGGGCTG GAA6AGGA6T CTCTOOGCTC 120 

GGTGCCAGCG CCTGCCTGOG CGCTGCTGCT GCTGTTTCCC CTCACGGCCC AGCATGAGAA 180 

CTTCAGGAAA AAGCAGATTG AAGAGCTGAA GGGACAAGAA GTTAGTCCTA AAGTGTACTT 240 

CATGAAGCAG ACCATTGGGA ATTCCTGTGG CACAATCGGA CTTATTCACG CAGTGGCCAA 300 

TAATCAAGAC AAACTGGGAT TTGAGGATGG ATCAGTTCTG AAACAGTTTC TTTCTGAAAC 360 

AGAGAAAATG TCCCCTGAAG ACAGAGCAAA ATGCTTTGAA AAGAATGAGG CCATACAGGC 420 

AGCCCATGAT GCCGTGGCAC AGGAAGGCCA ATGTCGGGTA GATGACAAGG TGAAT TTCCA 480 

TTTTATTCTG TTTAACAACG TGGATGGCCA CCTCTATGAA CTTGATGGAC GAATGCCTTT 540 

TCCXK3TGAAC CATGGCGCCA GTTCAGAGGA CAGOCTGCTG AAGGACGCTG CCAAGGTGTG 600 

CAGAGAATTC ACCGAGCGTG AGCAAGGAGA AGTCCX3CTTC TCTGCCGTGG CTCTCTGCAA 660 

GGCAGCCTAA TGCTCPGTGG GAGGGACTTT GCTGATTTCC CCTCTTCCCT TCAACATGAA 720 

AATATATACC CCCCATGCAG TCTAAAATGC TTCAGTACTT GTGAAACACA GCrGTTCTTC 780 

TGTTCTGCAG ACACGCCTTC CCCTCAGCCA CACCCAGGCA CTTAAGCACA AGCAGAGTGC 840 

ACAGCTGTCC ACTGGGCCAT TGTGGTGTGA GCTTCAGATG GTGAAGCATT CTCCCCAGTC 900 

TATGTCTTGT ATCCGATATC TAACGCTTTA AATGGCTACT ri'GGTTTCT G TCTGTAAGTT 960 
AAGACCTTGG ATGTGGTTAT GTTGTCCTAA AGAATAAATT TTGCTGATAG TAGC 



Seq ID NO: 128 Protein sequence:
Protein Accession #: NP_004172

MliKKVLSRLG VAXSQNRFVSV tGLEEESLGS VFAPACALZiL LFPLtAOaES FRK KQIBE LK 60

GQEVSPKVYP MKQTIGN'SCG TlGLIHAVftN BQDKLGPEDG SVLKQFLSBT EKMSPEDRAX 120 

CFBKNEMQA AHDAVAQBGQ CRVEOKVOTH FILFNNVDGH LYELDGBMPP PVHHGASSED 180 
TLIiKDAAICVC REFTEREQGB VRFSAVALCK AA 



Seq ID NO: 129 DNA sequence
Nucleic Acid Accession #: NM_000213
Coding sequence: 127-5385

1 11 21 31 41 51

| | | | | |

CGCCCGCGCG CTGCAGCCCC ATCTCCTAGC GGCAGCXX^G GCGCGGAfiGG AGOGAGTCOG 60 

CCCCGAGGTA GGTCCAGGAC GGGCGCACAG CAGCAGCCGA GGCTGGCOGG GA GAGGG AGG 120 

AAGAGGATGG CAGGGCCAOG CCCCAGCCCA TGGGCCACGC TGCTCXTIGGC AGCCTTGATC 180 

AGCGTCAGCC TCTCTCGGAC CTTGGCaUAC CGCTGCAAGA AGGCCCCAGT GAAGAGCTGC 240 

ACGGAGTGTG TCCGTGTGGA TAAGGACTGC GCCTACTGCA CAGAOSAGAT GTTCAGGGAC 300 

OGGOSCroCA ACACCCAGGC GGAGCTGCTG GCCSGOSGGCT GCCAGOGGGA GAGCATOGTG 360 

GTCATGGAGA GCAGCTTCCA AATCACAGAG GAGACCCAGA TTGACACCAC CCTGCG GCGC 420 

AGCCAGATGT CCCCCCAAGG CXnGOGGGTC CGTCTGOGGC COGGTGAGGA GCGGCATTTT 480 

GAGCTGGAGG TCTTTGAGCC ACTGGAGAGC CCCGTGGACC TGTACATCCT CMGGj^TTC 540 

TCCAACTCCA TGTCCGATGA TCTGGACAAC CTCAAGAAGA TGGG GCAGAA CCTOGCTOGG 600 

GTCCTGAGCC AGCTCACCAG CGACTACACT ATTGGATTTG GCAAGTTTGT GGACAAAGTC 660 

AGGGTCCCGC AGACGGACAT GAGGCXTTGAG AAGCTGAAGG AGCCCTGGCC CAACAGTGAC 720 

COCCCCTTCT CCTTCAAGAA OGTCATCAGC CTGACAGAAG ATGIGGATGA GTTCOGGAAT 780 

AAACTGCAGG GAGAGOGGAT CTCAGGCAAC CTGGATGCTC CTGAGGGOSG CTTCGATCCC 840 

ATCCTGCAGA CAGCTGTGTG CAOGAGGGAC ATTGGCTGGC GCCOGGACAG CACOCACCTG 900 

CTGGTCTTCT CCACOGAGTC AGCCTTCCAC TAT6AGGCTG ATGGOGCCAA 0GTGCTG6CT 960 

GGCATCATCA GCCGCAACGA TGAACGGTGC CACCTGGACA CCACGGGCAC CTACACCCAG 1020 

TACAGGACAC AGGACTACCC GTCGGTGCCC AOCCTGGTGC GCCTGCTCGC CAAGCACAAC 1080 

ATCATCCCCA TCTTTGCTOT CACCAACTAC TCCTATAGCT ACTACGAGAA GCTTCACACC 1140 

TATTTCCCro TCTCCTCACr GGGGGTGCTG CAGGAGGACT CGTCCAAC3^T CGTGGAGCTG 1200 

CTGGAGGAGG CCTTCAATC3G GATCCGCTCC AACCTGGACA TCCGGGCCCT AGACAGCCCC 1260 

CGAGGCCrrC GGACAGAGGT CACCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320 

CACATCCGGC GGGGGGAAGT GGGTATATAC CAGGTGCAGC TGCGGGCCCT TGAGCACGTG 1380

GATGGGAOGC ACGTGTGCCA GCTGCCGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 

TCCTTCTCCG AOSGCCTCAA GATGGACGOG GGCATCATCT GTGATGTGTG CACCTGCGAG 1500

CTGCAAAAAG AGGTGCGGTC AGCTCXK:tGC AGCTTCAACG GAGACTTCGT GTGCGGACAG 1560

TGTGTCTGCA GOGA6GGC«5 GAGTGGCCAG ACCTGCAACT GCTCCACCGG CTCTCTGAGT 1620 

GACATTCAGC CCTGCCTGCG GGAGGGCGAG GACAACCCX3T GCTCCQGCOB TGGGGAfllGC 1680 

CAGTGCGGGC ACTGTGTGTG CTACGGOGAA GGCCGCTAC3G AGGGTCAGTT CTGCQAGTAT 1740 

GACAACTTCC AGTGTCCCCG CACTTCCGGG TTCCTCTGCA ATGACOGAGG AOSCTGCTCC 1800 

ATGGGCCAGT GTGTGTGTGA GCCTGGTTGG ACAGGCCCAA GCTGTGACTG TCCCCTCAGC 1860 

AATGCCACCT GCATCGACyVG CAATGGGGGC ATCTGTAATG GAaTTGGCCA CTGTGAGTGT 1920 

GGCQGCTGCC ACTGCCACCA GCAGTCGCrC TACAOSGACA CCATCTGOGA GATCAACTAC 1980 

TOSGCGATCC ACXXX5GGCCT CTGCGAGGAC CTACSGCTCCT GOGTGCAGTG OCAGGCGTGG 2040 

GGCACCGGCG AGAAGAAGGG GCGCACGTGT GAGGAATGCA ACTT CAAGg T CAAGATGGTG 2100 

GACGAGCTTA AGAGAGCX3QA GGAGGTGGTG GTGCX3CTGCT CCTTOOGGGA OSAGG ftaGAC 2160 

GACTGCACCT ACAGCTACAC CATCGAAGGT GACX3GCGCCC CTGGGCCCAA CAGCACEGTC 2220 

CTCGTGCACA AGAAGAAGGA CTGCCCTCCG GGCTCCTTCT GGTGGCTCAT CCCCCTGCTC 2280 

CTCCTCCTCC TCCC3GCTCCT GGCCCTGCTA CTGCTGCTAT GCTGGAAGTA CTG TGCC TGC 2340 

TGCAAGGCCT GCCTGGCACT TCTCCCGTGC TGCAACOGAG GTCACATQGT GGGCTTTAAG 2400 

GAAGACCACT ACATGCTOGG GGAGAACCTG ATGGCCTCTG AOCACTTGGA CAG6CCCATG 2460 

CTGCGCAGCG GGAACCTCAA GGGCCGTGAC GTGGTCOGCT GGAAGGTCAC C3UVCAACATG 2520

CAGCGGCCTG GCTTTGCCAC TCATGCCGCC AGCATCAACC CCACAGAGCT GGTGCCCTAC 2580

GGGCTGTCCT TGCGOCTGGC CCGCCTTTGC ACCGAGAACC TGCTGAAGCC TGACACTOGG 2640 

GAGTGCGCCC AGCTG03CCA GGAGGTGGAG GAGAACCTCA AOGAGGTCTA CAGGCAGATC 2700 

TOOGGTGTAC ACAAGCTCCA GCAGACCAAG TTCCGGCAGC AGCCCAATGC OGGGAAAAAG 2760 

CAAGACCACA CCATTGTCGA CACAGTGCTG ATGGOGCCCC GCTCGGCCAA GCCGGCCCTG 2820 

CTGAAGCTTA CAGAGAAGCA GGTGGAACAG AGGGCCTTCC ACGACXrrCAA GGTGGCCCCC 2880 

GGCTACTACA CX:CTCACTGC AGACCAGGAC GCCCXWGGCA TGGTGGAGTT OCAGGAGGGC 2940 

GTGGAGCTGG TGGACGTACG GGTGCCCCTC TTTATCCGGC CTGACGATGA OGACGAGAAG 3000 

CAGCTGCTGG TGGAGGCCAT CGACGTGCCC GCAGGCACTG CCAC CCTOG G CCGCCGCCTG 3060 

GTAAACATCA CCATCATCAA GGAGCAAGCC AGAGACX3TGG TGTOCTTTGA GCAGCXTTGAG 3120 

TTCTCGGTCA GCCGCGGGGA CCAGGTGGCC CGCATCCCTG TCATCCX3GCG TGTCCTGGAC 3180 

GGGGGGAAGT CCCAGGTCTC CTACOGCACA CAOOATGGCA COGOGCAGGG CAACCGGGAC 3240 

TACATCXrCOS TGGAGGGTGA GCTGCTGTTC CAGCCTGGGG AGGOCTGGAA AGAGCTGCAG 3300 

GTGAAGCTCC TGGAGCTGCA AGAAGTTGAC TCCXTTCCTGC GGGGCCGCCA GGTCCGCCGT 3360 

TTCX31CGTCC AGCTCAGCAA CCCTAAGTTT GGGGCCCACC TGGGCCAGOC CCACTCCACC 3420 

ACCATCATCA TCAGGGACCC AGATGAACTG GACCGGAGCT TCACX3AGTCA GATGTTGTCA 3480 

TCACAGCCAC CCCCTCACGG CGACCTGGGC GCCCCQCAGA ACCCCAATGC TAAGGCCGCT 3540 

GGGICCAGGA AQATCCATTT CAACTOGCTG CCCCCTTCTG GCAAGCCAAT GGGGTACAGG 3600 

GTAAAGTACT GGATTCAGGG TGACTCOSAA TCCGAAGCCC ACCTGCTCGA CAGCAAG6TG 3660 

CCCTCAGTGG AGCTCAGCAA CCTGTACCCG TATTGCGACT ATGA6ATGAA GGTGT60GGC 3720 

TACGGGGCTC AGGGCGAGGG ACCCTACAGC TCCCTGGTGT CCTGOOGCAC CCACCAGGAA 3780 

GTGCCCAGCG AGCCAGGGOG TCTGGCCTTC AATGTCGTCT CCTCCACGGT GACCCAGCTG 3840 

AGCTGGGCTG AGCOGGCTGA GACCAACGGT GAGATCACAG CCTACGAGGT CTGCTATGCC 3900 

CTGGTCAAOS ATGACAACOG ACCTATTGGG CCCATGAAGA AAGTGCTGGT TGACAACCCT 3960 

AAGAACOGGA TQCIGCTTAT TGAGAACCTT CGGGAGTOCC AGCCCTACCG CTACAOSGrG 4020 

AAGGOGCGCA ACGGGGCCGG CTGGGGGCCT GAGOGGGAGG CCATCATCAA CCTGGCCACC 4080 

CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCGT GGAOSCCCAG 4140 

AGCGGGGAGG ACTACGACAG CTTCCTTATG TACAGOGATG ACGTTCTAOG CTCTCCATOG 4200 

GGCAGCCAGA GGCCCAGCGT CTCOGATGAC ACTGAGCACC TGGTGAATGG CCGGATGGAC 4260 

TTTGCCTTOC CGGGCAGCAC CAACTCOCTG CACAGGATGA CCAOGACCAG TGCTGCTGCC 4320 

TATGGCACCC ACCTCAGCCC ACACGTGCCC CACCGOGTGC TAAGCACATC CTCCACCCTC 4380 
ACAOGOGACT ACAACTCACT GACCCGCTCA GAACACTCAC ACTOGACCAC ACTGCOGAGG 4440

GACCACTGCA CCCTCAOCTC OtSTCTCCTOC CAOGACTCTC GCCTGACTGC T GGTG TGCCC 4500 

GACAOGCCCA CCa S CCrGGT GTTCTCTGOC CIGGOGOOCA CATCTCTCAG fl GTGAGC TGG 4560

CAGGAGCOGC GGTGOGAGOG GCQGCTGCAG OGCTACAGTG TBGAGTACXA GCTGCIGAAC 4620 

GGOGGTGAGC TGCATOGGCT CAACATCCCC AACCCl-GCOC AGACCTCGGT GGTGGTQGAA 4680 

GACCTCCTGC CCAACCACTC CrAOGTGTTC CGCGTGOGGG CCCAGAGCCA GGAAGGCTGG 4740 

GGOCGAGAGC GTGAGGGTGT CATCACCATT GAATCCCAGG TGCACOOGCA GAGCCCACTG 4800 

TGIGCCCTGC CAGGCT006C C ' llXACm t j AGCACTOOCA GTGCCCCAGG CCOGCTGGTG 4860 

TTCACTGCCC TGAGCCCAGA CTOGCTGCAG CTGAGCTGQG A60GGCCACG GAGGCOCAAT 4920 

GGGGATATC3G TCGGCTACCT GGTOACCTGT GAGATGGCCC AAGGAGGAGG GCCAGCCACC 4980 

GCATTCCGGG TGGATGGAGA CAGCCCCGAG AGCOOGCTGA COGTGCOGGG CCTCACOGAG 5040

AA O G'ro C C C T ACAAGTTCAA GGTGCAGGCC AGGACCACTG AGGGCTTOGG GCCAGAGOGC 5100 

GAGOGCATCA TCACCATAGA GTCCCACGAT GGAC^ACCCT TOCCCCAGCT GGGCAGCOGT 5160 

GCCGGGCTCT TCCAGCACCX: GCTGCAAAGC CAGTACACCA GCATCACCAC CACCCACACC 5220

AGOGCCACOG AGCCCTTCCT AGTGGATGGG CCGACCCTGG GGGCCCAGCA CCPGGAGGCA 5280 

GGOGGCTCCC TCACCCGGCA TGTGACCCAG GRGTTTGTGA GCCOGACACT GACCACCAGC 5340 

GGAACCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA CTTGACOGCA CCCTGCCCCA 5400 

CCOCOGCCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC CCGGAGCCTC CTCAGCTACT 5460 

CCATCCTTGC ACCCCTGGGG GCCCAGCCCA CCCGCATGCA CAGAGCAGGG GCTAGGTGTC 5520 

TOCTGGGAGG CATGAAGGGG GCAAGGTCCG TCCTCTGTGG GCCCAAACCT ATTTGTAACC 5580 

AAAGAGCTGG GAGCAGCACA AGGACCCAGC CTTTGTTCTG CACTTAATAA ATGGTTTTGC 5640 
ACTG 



Seq ID NO: 130 Protein sequence: 
Protein Accession #: NP_000204

1 11 21 31 41 51 

jlAGPRPSPWA RIjLLAALISV slsgtlanrc KKAPVKSCTE CVRVDKDCAY CTDEMFRDRR 60 

CKTQAELIiAA GCQRESIVVM ESSPQITEET QIDTTLRRSQ MSPQGLPVRI. RPGEERHPEL 120 

EVFBPLESPV DLYILMDFSN SMSODLDNLK KMGQNLARVL SQLTSDYTIG FGKFVDKVSV 180 

PQTDMRPEKL KEPWPNSDPP PSPKNVISLT EDVDEFRNKL QGEHISGNLD APEGGF DAIL 240 

QTAVCTRDIG HRPDSTHLLV FSTBSAFHVB ADGANVIAGI KSRNDERCHIi DTTGTYTQYR 300 

TQDYPSVPTL VRLLAKHNII PIFAVTNYSY SYYEKLHTYP PVSSLGVLQE DSSNIVELLE 360 

BAPNRIRSNL DIRALDSPRG LRTEVTSKMF QKTRTGSFHI RRGEVGIYQV QIiRAI^EHVDG 420 

THVCQLPEDQ KGNIHLKPSP SDGLKMDAGI ICDVCTCBLQ KBVRSARCSF NGDFVOSQCV 480 

CSBGWSGQTC NCSTGSLSOI QPCLHEGEDK PCSGRGEOQC GKCVCYGHGR YEGQFCEYDil 540 

EtJCPRTSGFL CNDRGRCSMG QCVCEPGWTG PSCDCPLSNA TCIDSNGGIC NGRGHCEOGR 600 

CBCKQQSLYT DTICEIMYSA IHPGLCEDLR SCVQCQAWGT GEKKGRTCEE CNFICVKMVDB 660 

LKRAEEVWR CSFRDEDDDC TYSYIMEGDG APGPNSTVLV HKKKDCPPGS FWWI,IPLLIiL 720

LLPLLALLIiL LCWKYCACCK ACLALLPCCN RGHMVGFKED HYMIiRBMUIA SDHLDTPMLR 780 

SGNLKGRDW RWKVTNNMQR PGFATHAASI KPTELVPYGL St^RLARLCTB NLLKPDTREC 840 

AQLRQEVEEN LNEVYRQISG VHKLQQTKFR QQPNAGKKQD HTIVDTVLMA PRSAKPALLK 900 

LTEKQVEQRA FHDLKVAPGY YTLTADQDAR GMVEFQEGVE LVDVRVPLFI RPEDDDEICQL 960 

LVEAIDVPAG TATLGHRLVM ITIIKEQARD WSFEQPEPS VSRGDQVARI PVIRRVLDGG 1020 

KSQVSYRTQD GTAQQfRDYZ PVEGBLLFQP GSAWKELQVK LLELQEVOSZj LRGRQVRRFH 1080 

VQLSNPKFGA HLGQPHSTTI IIRDPDELDR SFTSQMLSSQ PPPHGDLGAP QMPMAKAAGS 1140 

RKIHFNWIiPP SCKPMGYRVK YWIQGDSESE AHLLDSKVPS VELTNLYPYC DYEMKVCAYG 1200 

AQGEGPySSL VSCRTHQEVP SEPGRLAFNV VSSTVTQLSW AEPAETNGEI TAYEVCYGLV 1260 

NDDNRPIGPM KKVLVDNPKN RMLLIENLRE SQPYRYTVKA RNGAGWGPER EAIINLATQP 1320 

KRPMSIPIIP DIPIVDAQSG EDYDSFLMYS DDVLRSPSGS QRPSVSDDTE HLVNGRMDFA 1380 

FPGSTNSLHR MTTTSAAAYG THLSPHVPHR VIiSTSSTLTR DYNSLTRSEH SHSTTLPRDY 1440 

STLTSVSSHD SRLTAGVPDT PTRLVPSALG PTSLRVSWQE PRCERPLQGY SVEYQIjLWGG 1500 

ELKRLNIPNP AQTSWVEDL LPNHSYVPRV RAQSQBGWGR EREGVITIES QVHPQSPLCP 1560 

LPGSAFTLST PSAPGPLVPT AI»SPOSLQIjS HERPRRPNGD IVGYLVTCEN AQGGGPATAF 1620 

RVDGDSPESR LTVPGLSEWV PYKFKVQART TB6FGPERBG ZITIESQDG6 PFPQLGSRAG 1680 

LFQHPIiQSBY SSFTTTBTSA TEPFLVDGPT LGAQHXiBAGG SLTRHVTQEP VSBTLTTSGT 1740 
LSTHMDQQPP QT 

Seq ID NO: 131 DNA sequence
Nucleic Acid Accession #: BC004372
Coding sequence: 132..2231

1 11 21 31 41 51

| | | | | |

CCTOGTGCCG CGGACCCCAG CCTCTGCCAG GTTOGGTCaS CCATCCTCGT CC06TCCTCC 60 

GCOGGCCCCT GCCCCGCGCC CAGGGATOCT CCAGCTCCTT TOGCCOGOGC OCTCOGTTCG 120 

CTCCXX3ACAC CATGGACAAG TTTTGGTGGC ACX3CAGCCT6 GGGACTCTGC CTCGTGCOGC 180 

TGAGCCTGGC GCAGATCGAT TTGAATATAA CCTGCCGCTT TGCAGGTGTA TTCCACGTGG 240

AGAAAAATGG TOGCTACAGC ATCTCTOGGA CGGAGGCCGC TGACCTCTGC AAGGCTTTCA 300 

ATAGCACCTT GCCCACAATG GCCCAGATGG AGARAGCTCT GAGCATOGGA TTTGAGACCT 360 

GCftGGTATGG GTTCATAGAA GGGCATGTGG TGATTCCCOQ GATCCACOCC AACTCCATCT 420 

GTGCAGCAAA CAACACAGGG GTGTACATCC TCACATCCAA CACCTOCCAG TATGACACAT 480 

ATTGCTTCAA TGCTTCAGCT CCACCTGAAG AAGATTGTAC ATCAGTCACA GACCTGCCCA 540 

ATGCCTTTGA TGGACCAATT ACCATAACTA TTGTTAACOG TGATGGCACC 0GCTAT6TCC 600 

AGAAAGGAGA ATACAGAACG AATCCTGAAG ACATCTACCC CAGCAACCCT ACTCATGATG 660 

ACGTGAGCAO CGGCTCCTCC AGTGAAAGGA GCAGCACTTC AGGACGTTAC ATCTTTTACA 720 

CXTTTTTCTAC TGTACACCCC ATCCCAGACG AAGACAGTCC CTGGATCACC GACAGCACAG 780 

ACAGAATCCC TGCTACCAGT ACGTCTTCAA ATACCATCTC AGCAGGCTGG GAGCCAAATG 840 

AAGAAAAT6A AGATGAAAGA GACAGACACC TCAGTTTTTC TGGATCAGGC ATTGATGATG 900 

ATGAAGATTT TATCTCCAGC ACCATTTCAA CCACACCAOG G6CTTTTGAC CACACAAAAC 960 

AGAACCAGQA CTGGACCCAG TCGAACCCAA GCCATTCAAA TCOGGAAGTG CTACTTCAGA 1020 

CAACX:aCAAG GATGACTGAT GTAGACAGAA ATGGCACCAC TGCTTATGAA GGAAACTGGA 1080 

ACCCAGAAGC ACACCCTCCC CTCATTCACC ATGAGCATCA TGAG6AAGAA GAGACCCCAC 1140 

ATTCTACAAG CACAATCCAG GCAACTCCTA GTAGTACAAC GGAAGAAACA GCTACCCAGA 1200 

AGGAACA6TG 6TTTGGCAAC AGATGGCATG AGGGATATOG CCAAACAGCC AGAGAAGACT 1260 
CCCftTTCGAC AACAGGGACA GCTGCAGCCT CAGCTCATAC CAGCCATCCA ATGCAAGGAA 1320

C5GACAACA0C AAGCCCAGAG GACAGTTCCT GGACTGATTT CTTCAACCCA ATCTCACACC 1380

CCATGGGACG AGC3TCATCAA GCAGGAAGAA GGATGGATAT GGACTOCAGT CATAGTACAA 1440 

CGCTTCAGCC TACTGCAAAT CCAAACACAG GTTTGCTGGA AfiATrTGGAC AfiGACAGGAC 1500 

CTCTTTCAAT GACAAOGCAG CAGAGTAATT CTCAGAGCTT CTCTACATCA CATGAAGGCT 1560 

TGGAAGAAGA TAAAGACCAT CCAACAACTT CTACTCTGAC ATCAA GCAAT ACCaATGATG 1620 

TCACAGGTG8 AAGAACAGAC CCAAATCATT CTGAACGCTC AACTACTTTA CTGGAAGGTT 1680 

ATACCTCTCA TTACCCACAC ACGAAGGAAA GCAQGACCrr CATCCCAGTG ACCTCAGCTA 1740 

AGACTGCGTC CTTTGGAGTT ACTGCAGTTA CTGTTGGAGA TTCCAACTCT AATGTCAATC 1800 

GTTCCTTATC ACGAGACCAA GACACATTCC ACOCCAGTGG GGGGTCCCAT ACCACTCATG 1860 

GATCTGAATC AGATGGACAC TCACATGGGA GTCAAGAAGG TGGAGCAAAC ACAACCTCTG 1920 

GTCCTATAAG GACACCCCAA ATTCCAGAAT GGCTGATCAT CTTGGCATCC CTCTTQQCCT 1980

TCGCTTTGAT TCrTGCAGTT TGCATTCCAG TCAACAGTCG AAGAAGGTGT GGGCACAAGA 2040 

AAAAGCTAGT GATCAACAGT GGCAATGGAG CTGTGGAGGA CAGAAAGCCA AGTGGACTCA 2100 

ACGGAGAGGC CAGCAAGTCT CAGGAAATGG TGCATTTGGT 6AACAAGGAG TCGTCAGAAA 2160 

CTCCAGACCA CTTTATGACA GCTGATGAGA CAAQ6AACCT GCAGAATgTG GAOVTGAAGA 2220 

TTGGGGTGTA ACACCTAC3VC CATTATCTTG GAAAGAAACA A COGfTGGAA ACATAACCAT 2280 

TACftGGGAGC TGGGACACTT AACAGATGCA ATGTGCTACr GATrGTPTCA TTGOGAATCT 2340 
TTTTTAGCAT AAAATTTTCT ACTCTTAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 132 Protein sequence:
Protein Accession #: AAH04372

1 11 21 31 41 51

| | | | | |

MDKFWWHAAW GLCLVPLSIA QIDLNITCRP A6VFHVEKNG RYSISRTEAA DLCKAFMSTI. 60 

PTMAQMBKAL SIGFETCRYG PIEGHWIPR IHPNSICAAW NTGVYILTSN TSQYDTYCFN 120 

ASAPPEEDCT SVTDLPMAFD GPITITIVNR DGTRWQKGE YRTRPEDIYP SNPTDDDVSS 180 

6SSSERSSTS GGYIFYTFST VHPIPDEDSP WITDSTDRIP ATSTSSKTIS AGWEPNEENB 240 

DBRDRHLSPS GSGIDDDEDF ISSTISTTPR AFDHTKQNQD VmiWNPSHSN PEVLLQTTTR 300 

MTDVDRNGTT AYBGNHNPEA HPPLIHHEHH EEEETPHSTS TIQATPSSTT EETATQKEQW 360 

FGNRWHEJGYR QTPREDSBST TGTAAASAHT SHPMQGRTTP SPEDSSWTDF ENPISHPMGR 420 

GHQAGRRMDM DSSHSTTMJP TANPNTGLVE DIiDRTGPLSM TTQQSMSQSF STSHEGLEED 480 

KDHPTTSTLT SSNRNDVTGG RRDPNHSEGS TTLLEGYTSK YPKTKESRTF IPVTSAKTOS 540 

ECVTAVTVGD SNSNVNRSLS GDQDTFHPSG GSHTTEGSES DGHSHGSQBG GANTTSGPIR 600 

TPQlPEWt.II LASLLALALI IJWCIAVNSR RRCGQKKKLV INSGKGAVED RKPSGLNGEA 660 
SKSQEMVHLV NKSSSETPDQ PMTADETRNL QNVOMKIGV 



Seq ID NO: 133 DNA sequence
Nucleic Acid Accession #: NM_002882
Coding sequence: 150-755



1 11 21 31 41 51 

OGAGGTTCGG GTOGTGGGGC GGAGGGAAGA GCQGGOGGGC GGGAGGOGCC GGOSCCAGAC 60 

GOGGAGGGAA GGAGCTACX3A GTAGCCGCCG AGAGGCCGCG GAGCCAGCGA CGACCGACCC 120 

AGCCGAGCOG (XGCCGCCGC CGCGCCCCCA TGGCGGCCGC CAAGGACACT CATGAGGACC 180 

ATGATACTTC CACTGAGAAT ACAGACGAGT CCAACCATGA CCCTCAGTTT GAGCCAATAG 240 

TTTCTCTTCC TGAGCAAGAA ATTAAAACAC TGGAAGAAGA TGAAGAGGAA CTTTTTAAAA 300 

TGCGGGCAAA ACTGTTCOGA TTTGCCTCTG AGAAOGATCT CCCAGAATGG AACGAGCGAG 360 

GCACTGGTGA CGTCAAGCTC CTGAAGCACA AGGAGAAAGG GGCCATCCGC CTCCTCATGC 420 

GGAGGGACAA GACCCTGAAG ATCTGTGOCA ACCACTACAT CACGOCGATG ATGGAGCTGA 480 

AGCCCAACGC AGGTAGCGAC CGTGCCTGGG TCTGGAACAC CCACXX:TOAC TTCGCCGACG 540 

AGTGCCCCAA GCCAGAGCTG CTGGCCATCC GCTTCCTGAA TGCTGAGAAT GCACAGAAAT 600 

TCAAAACAAA GTTTGAAGAA TGCAGGAAAG AGATCGAAGA GAGAGAAAAG AAAGCAGGAT 660 

CAGGCAAAAA IGATCATGCC GAAAAAGTGG CGGAAAAGCT AGA AGCTCT C TCG GTGAAGG 720 

AOGAGACCAA GGAGGATCCT GAGGAGAAGC AATAAATCGT CTTATTTTAT -i-rn- nTTCC 780 

TCTCTTTCCT TTCCTTTTTT TAAAAAATTT TACCCTGCCC CTCTTTTTCG GTTTGTTTTT 840 
ATTCTTTCAT TTTTACAAGG GAOGTTATAT AAAjSAACTGA ACTC 

Seq ID NO: 134 Protein sequence: 
Protein Accession #: NP_002873 

1 11 21 31 41 51 

MAAAKDTHED HDTSTENTDE SNHDPQPBPI VSLPEQEIKT LEEDEBEIiPK MRAiCLFRFAS 60 

ENDLPEMKER GTGDVKLLKH KEKGAIRLLM RRDKTLKICA NHYITPMMEL KFNAGSDRAW 120 

VWNTHAEFAD ECPKPELLAI RFLNAEMAQK FRTKPEECRK EIEEREKKAG SGKHDHABKV 180 
AEKLEALSVK EETREOASEK Q 



Seq ID NO: 135 DNA sequence

Nucleic Acid Accession #: NM_000077.2

Coding sequence: 277-742

1 11 21 31 41 51

| | | | | |

CCCAACCTGG GGCGACTTCA GGTGTGCCAC ATTOGCTAAG TGCTCGGAGT TAATAGCACC 60 
TCCTCCGAGC ACTCGCTCAC GGC3GTC0CCT TGCCTOGAAA GATACC6CGG TCCCTCCAGA 120 
GGATTTGAGG GACAGGGTCG 6AGGGG6CTC TTCCGCCAGC ACCGGAGGAA GAAAGAGGAG 180 
GGGCTGGCTG GTCACCAGAG GGTGGGGCGG ACOGCGTGCG CTCGGCGGCT GCGGAGAGGG 240 
GGAGAGCAGG CAGCGGGCGG CGGGGAGCAG CATGGAGCCG GCGGCGGGGA GCAGCATGGA 300 
GCCTTCGGCT GACTGGCTGG CCACGGOCGC GGCCCGGGGT CGGGTAGAGG AGGTGCGGGC 360 
GCTGCTCGAG GCGGGGGCGC TOCCCAAOGG ACOSAATAGT TACGGTCGGA GGCC6ATCCA 420 
GGTCATGATG ATGGGCAG06 CCCGAOTGGC GGAGCTGCTG CTGCTCCA06 GOGOGGAGOC 480 
CAACTGCGCC GACCC06CCA CTCTCACCOG AOCOGTGCAC GAOGCTGOOC GGGAGG GCTT 540

CXrrtSGACAOG CTGGTGGT GC TGCACOGGGC CGGGGOGOQG CIGGAO GTCC GCGATGCCTG 600 

GGGCCGTCTG CCCGTCGACC TGGCTGAGGA GCTGGGCCAT CGOCaVTGTOG CftOGCTACCT 660 

GOGOGOGGCT GCGGGGGGCA CXSGAGGCAG TAACCATGOC OGCATAGATG CCJGOGGAAGG 720 

TCCCTCAGAC ATCCCXS3ATT GAAAGAACCA GAGAGGCTCT GAGAAACCTC GGGAAACTTA 780 

GATCATCAGT CACCGAAGGT CCTACAGGGC CACAACTGCC OOOGCCACaA CCCACCCOGC 840 

■mO G TAGTT TTCATTTAGA AAATAGAGCT TTTAAAAATG TCCTGCCTTT TAAOGTAGAT 900 

ATATGCCrrc CCXX3VCTACC GTAAATOTOC ATTTA TATCft TTTTTTATAT ATTCTT ATAA 960 

AAATGTAAAA AAGAAAAACA CX X;C iTCT GC CTTTTCACPG TGTTGGAGTT TTCIGGAGTG 1020 

AGCACTCAOG CCCTAAGCGC ACATTCATGT GGGCATTTCT TGOGAGCCTC GCAGCCTCOG 1080 

GAAGCTGTOG ACTTCATGAC AAGCATTTTG TGAACTAGGG AAGCTCAGGG GGGTTACTGG 1140 

CTTCTCntSA GTCACACTGC TAGCAAATGG CAGAAOCAAA GCTCAAATAA AAATAAAATA 1200 
ATTTTCATTC ATTCACTC 

Seq ID NO: 136 Protein sequence:
Protein Accession #: NP_000068.1

1 11 21 31 41 51 

MBPAAGSSME PSADWLATAA ASGHVBEVRA I4LEAGALPNA PNSYGRRPIQ VMMMGSARVA 60 

BLUiHGAEP NCADPATLTR PVHDAAREGP LDTLWLHRA QKRUOVBIAV GRLPVDLABE 120 
LGHRnVARYL RAAAGGTRGS MHARIDAAEG PSDIPD 



Seq ID NO: 137 DNA sequence

Nucleic Acid Accession #: NM_058196.1

Coding sequence: 104-421

1 11 21 31 41 51

| | | | | |

TGTGTGGGGG TCTCCTTGGC GGTGAC5GGGG CTCTACACAA GCTTCCTTTC OGTCATGCCG 60 

GCCCCCACCC TCGCTCTGAC CATTCTGrrC TCTCTGGCAG GTCATCAIGA TGGGCftGOSC 120 

CCGAGTGGOG GAGCTGCTGC TGCTOCACGG CGCGGAGOCC AACTGCGCCG ACCCCGCCAC 180

TCTCACCCGA CCCGTGCAOG ACGCTGCCXG GGAGGGCTTC CPGGAOVOGC TGGTGOTCCT 240 

GCACCGGGCC GGGGCGCX3GC TCGACGTGCG CGATGCCTGG GGCCGTCTGC CCGTGGACCT 300 

GGCTGAGGAG CTGGGGCATC GCGATGTCJGC ACGGTACCTG CGOGOSGCTG CGGGGGGCAC 360 

CAGAGGCAGT AACCATGCCC GCATAGATGC OGOGGAAGGT CCCTCAGACA TCCCCGATTG 420 

AAAGAACCAG AGAGGCTCTG AGAAACCTOG GGAAACTTAG ATCATCAGTC ACCGAAGGTC 480

CTACAGGGCC ACAACT6CCC CCGCCACAAC CCACCGC3GCT TTOGTAGTTT TCATTTAGAA 540 

AATAGAGCTT TTAAAAATGT CCTGCCTTTT AACGTAGATA TAAGCCTTCC CCCACTAOCG 600 

TAAATGTCCA TTTATATCAT TTTTTATATA TTCTTATAAA AATCTAAAAA AGAAAAACAC 660 

CGCTTCTCCC TTTTCACTGT GTTGGAGTTT TCTGGAGTGA GCACTCACGC CCrTAAGOGCA 720 

CATTCATGTG GGCATTTCTT GCGAGCCTOG CAGCCTCCGG AAGCTGTCGA CTTCATGACA 780 

AGCATTTTGT GAACTAGGGA AGCTCAGGGG GGTTACTGGC TTCTCTTGAG TCACACTGCT 840 
AGCAAATGGC AGAACCAAAG CTCAAATAAA AATAAAATAA TTTTCATTCA TTCACTC 



Seq ID NO: 138 Protein sequence:
Protein Accession #: NP_476103.1

1 11 21 31 41 51

| | | | | |

MMMGSARVAE UitLHGAEPN CRDPATLTRP VHDAftREGPL DTXjWLHRAG ARIiDVRDAWG 
RI»PVDIiAEEL GHRDVARYIiR AAAGGTRGSN HARIDAAB6P SDIPD 



Seq ID NO: 139 DNA sequence 

Nucleic Acid Accession #: NM_058197.1

Coding sequence: 272-684

1 11 21 31 41 51

| | | | | |

CCCAACCTGG GGCGACTTCyV GGTGTGCCAC ATTCGCTAAG TGCTCGGAGT TAATAGCACC 60 

TCCTCOGAGC ACT06CTCAC GGCGTCCCCT TGCCTGGAAA GATACCGCGG TCCCTCCAGA 120 

GGATTTGAGG GACAGG6TCX3 GA6GGGGCTC TT0CX3CCAGC ACCGGAGGAA GAAAGAGGAG 180 

GGGCTGGCTG GTCACCAGAG GGTGGGGCGG ACXXSCGTGOG CTCGGCGGCT GCGGAGAGGG 240 

GGAGAGCAGG CAGCGGGOGG CGGGGAGCAG CATGGAGCCG GCGGCGGGGA GCAGCATGGA 300 

GCOGGCGGCG GGGAGCAGCA TGGAGCCTTC GGCTGACTGG CTGGCCACGG OCGCGGCCOG 360 

GGGTOGGGTA GAGGAGGTGC GGGCGCTGCT GGAGGCGGGG GCGCTGCCCA AGGCACCX3AA 420 

TAGTTACGGT CGGAGGCOGA TCXZAGGTGGG TAGAAGGTCT GCAGCGGGAG CAGGGGATGG 480 

G6GG0GACTC TGGAGGAOGA AGTTTGCAGG GGAATTGGAA TCAGGTAGCG CTTCGATTCT 540 

COGGAAAAAG GGGAGGCTTC CTGGGGAGTT TTCAGAAGGG GTTTGTAATC ACAGACCTCC 600 

TCCTG6QGAC GCCCTGGGGG CTTGGGAAAC CAAGGAAGAG GA ATGAGG AG CCAOGOGOCTr 660 

ACAGATCTCT CGAATGCT6A GAAGATCTGA AGGGGGGAAC ATATTTGTAT TAGATGGAAG 720 

TCATGATX3AT GGGCAGOGCC CGAGTGGCGG AGCTGCTGCT GCTCCACGGC GCGGAGCCCA 780 

ACTGCGCCGA CCCCGCCACT CTCACXXS3AC COGTGCACGA CGCTGCCCGG GAGGGCTTCC 840 

TGGACAGGCT GGTGGTGCTG CACCGGGCCG GGG0GCX3GCT GGAOGTGCGC GATGCCTGGG 900 

GCCGTCTGCC CGTGGAGCTG GCTGAGGAGC TGGGCCATCG CGATGTCGCA CGGTACCTGC 960 

GOGOGGCTGC GGGGGGCACC AGAGGCAGTA ACCATGCCOG CATAGATGCC GCGGAftGGTC 1020 

CCTCAGACAT CCCOGATTGA AAGAACCAGA GAGGCTCTGA GAAACCTCGG GAACTT AGAY 1080 

CATCAGTCAC CGAAGGTCCT ACAGGGCCAC AACTGCXICCC GCCACAAOCC ACCCCQCTTT 1140 

CGTAGTTTTC ATTTAGAAAA TAGAGCTTTT AAAAATGTCC TGCCTTTTAA OGTAGATATA 1200 

TGCCTTCCCC CACTACCGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1260 

TGTAAAAAAG AAAAACACCG CTTCTGCCTT TTCACTGTGT TGGAGTTTTC TGGAGTGAGC 1320 

ACTCACGCCC TAAGCGCACA TTCATGTGGG CATTTCTTGC GAGCCTOGCA CCCTCCGGAA 1380 

GCTGTCGACT TCATCACAAG CATTTTGTGA ACTAGGGAAG CTCAGGGGGG TTACTGGCTT 1440 

CTCTTCAGTC ACACTGCTAG CAAATGGCAG AACCAAAOCT CAAATAAAAA TAAAATAATT 1500 
TTCATTCATT CACTC

Seq ID NO: 140 Protein sequence:
Protein Accession #: NP_478104.1

1 11 21 31 41 51

| | | | | |

NEPAAGSSME PAAGSSKEPS ADSfLATAAAR CaVBEVSALL EAGALPiiAPM SYGB&PZQVG 60 
RRSAAGAGDG GRLWRTKFAG BLBSGSASIL RKKGRLPCTF SBGVCKHRPP PGDAWSAWBT 120 

XEEB 

Seq ID NO: 141 DNA sequence 

Nucleic Acid Accession #: NM_058195.1

Coding sequence: 163-684

1 11 21 31 41 51

| | | | | |

CCTCCCTACG GGCX3CCTCCG GCAGCCCTTC CCGCGTGCGC AGGGCTCAGA GCCGTTCOGA 60 

GATCTTGGAG GTCCX3GGTGG GAGTGGGGGT GGGGTGGGGG TGGGGGTGAA GGTGGGGGGC 120 

GGGC3GCX;CTC AGGGAAGGCG GGTIGOGOGCC TGOGGGGOGG AGATGGGCAG GGGGCGGTGC 180 

G TOGGTCCCA GTCTGCAjGTT AAGGGGGCAG GAGTGGCGCT GCTCACCTCT GGTGCCAAAG 240 

G60GGCGCAG CGGCTGCCGA GCTCGGCCCT GGAGGOGGOS AGAACATGGT GCGCACGTTC 300 

TTG G TX5ACCC TCCGGATTCG GCGOGCGTGC GGCCCX^CCGC GAGTCAGGCT TTTC GTGGTT 360 

CACATCCCGC GGCTCACGGG GGAGTGGGCA GCGCCAGGGG CGCGCGCCX3C TGTGGCCCTC 420 

GTGCTGATGC TACTGAGGAG CCAGCGTCTA GGGCAGCAGC OGCTTCCTAG AAGACCAGGT 480 

CATGATGATG GGCAGCGCCC GAGTGGCGGA GCTGCTGCTG CTCCACGGCG CGGAGCCC3WV 540 

CTGCGCCGAC CCCGCCACTC TCACCOSACC CGTGCACGAC GCTGCCOGGG AGGGCTTCCT 600 

GGACAOGCTG GTGGTGCTGC ACOGGGCCGG GGCGOOGCTG GACXJTGOGCG ATGC CTGG GG 660 

COSTCTGCCC GTQGACCTGG CTGAGGAGCT GGGCCATOGC GATGTOGCAC GGTACCTGGG 720 

CGCGGCTGCG GGGGGCAOCA GAGGCAGTAA CCATGOCOGC ATAGATGCCG OGGAAGGTCC 780 

CTCAGACATC CCCGATTGAA AGAACCAGAG AGGCTCTGAG AAACCTCGGG AAACTT AGAT 840 

CATCAGTCAC OGAAGGTCXTT ACAGGGCCAC AACTGCCCCC GCC ACAAC CC ACCCCGCTTT 900 

CGTAGTTTTC ATTTAGAAAA TACAGCTTTT AAAAATGTCC TGCCTTTTAA CGTAGATATA 960 

TGCCTTCCCC CACTACCGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1020 

TGTAAAAAAG AAAAACACCG CTTCTGCCTT TTCACTGTGT TGGAGTTTTC TGGAGTGAGC 1080 

ACTCACGCCC TAAGCGCACA TTCATGTGGG CATTTCTTGC GAGCCTCGCA GCCTOCGGAA 1140 

GCTGTCGACT TCATGACAAG CATTTTGTGA ACTAGGGAAG CTCAGGGGG6 TTACTGGCTT 1200

CTCTTGAGTC ACACTGCTAG CAAATGGCAG AACX»AAGCT CAAATAAAAA TAAAATAATT 1260 
TTCATTCATT CACTC 

Seq ID NO: 142 Protein sequence:
Protein Accession #: NP_478102.1

1 11 21 31 41 51

| | | | | |

HGRGRCVGPS LQLRGQEWRC SPLVPKG6AA AAEU3PGGGE NMVRRFLVTL RIRRACGPPR 60 

VRVFWHIPR LTGEWAAPGA PAAVAIiVLML LRSQRLGQQP LPRRPGHDDG QRPSGQAAAA 120 

PRRGAQLRRP RHSHPTRARR CPGGLPGHAG GAAPGRGAAG RARCLGPSAR GPG 

Seq ID NO: 143 DNA sequence 
Nucleic Acid Accession #: NM_018131
Coding sequence: 412..1107

1 11 21 31 41 51

| | | | | |

GAAATTGCAC ACTTAAAGAC ATCAGTGGAT GAAATCACAA GTGGGAAAGG AAAGCTGACT 60 

GATAAAGAGA GACAGAGACT TTTGGAGAAA ATTCGAGTCC TTGAGGCTGA GAAGGAGAAG 120 

AATGCTTATC AACTCACAGA GAAGGACAAA GAAATACAGC GACTGAGAGA CCAACTGAAG 180 

GdCAGATATA GTACTACOGC ATTGCTTGAA CAGCTGGAAG AGACAACGAG AGAAGGAGAA 240 

AGGAGGGAGC AGGTGTTGAA AGCCTTATCT GAAGAGAAAG ACGTATTGAA ACAACA GTTG 300 

TCTGCTGCAA CCTCACGAAT TGCTGAACTT GAAAGCAAAA CCAATACACT CCGTTTATCA 360 

CAGACTGTGG CTCCAAACTG CTTCAACTCA TCAATAAATA ATATTCATGA AATG6AAATA 420 

CAGCTGAAAG ATGCTCTGGA GAAAAATCAG CAGTGGCTCG TGTATGATCA GCAGCGGGAA 480 

GTCTATGTAA AAGGACTTTT AGCAAAGATC TTTGAGTTGG AAAAGAAAAC GGAAACAGCT 540

GCTCATTCAC TCCCACAGCA GACAAAAAAG CCTGAATCAG AAGGTTATCT TCAAGAAGAG 600 

AAGCAGAAAT GTTACAAOSA TCTCTTQGCA AGTGCAAAAA AAGATCTTGA GGTTGAAOGA 660 

CAAACCATAA CTCAGCTGAG TTTTGAACTG AGTGAATTTC GAAGAAAATA TGAAGAAACC 720 

CAAAAAGAAG TTCACAATTT AAATCAGCTG TTGTATTCAC AAAGAAGGGC AGATGTGCAA 780 

CATCTCGAAG ATGATAGGCA TAAAACAGAG AAGATACAAA AACTCAGGGA AGAGAATGAT 840 

ATTGCTAGGG GAAAACTTGA AGAAGACAAG AAGAGATCCG AAGAGCTCTT ATCT CAGGTC 900 

CAGTCTCTTT ACACATCTCT GCTAAAGCAG CAAGAAGAAC AAACAAGGGT AGCTCTGTTG 960 

GAACAACAGA TGCAGGCATG TACTTTAGAC TTTGAAAATG AAAAACTCGA CCGTCAACAT 1020 

GTGCAGCATC AATTGCATGT AATTCTTAAG GAGCTCCGAA AAGCAAGAAA AAA TAACA CA 1080 

GTTGGAATCC TTGAAACAGC TTCATGAGTT TGCCATCACA GAGCCATTAG TCACTTTOCA 1140 

AGGAGAGACT GAAAACAGAG AAAAAGTTGC GGCCTCAOCA AAAAGTCCCA CTGCTGCACT 1200 

CAATGGAAGC CTGGTGGAAT GTCCCAAGTG CAATATACAG TATCCAGCCA C TGAGC ATCG 1260 

CGATCTGCTT GTCCATGTGG AATACTGTTC AAAGTAGCAA AATAAGTATT TGTTTTGATA 1320 

TTAAAAGATT CAATACTGTA TTTTCTGTTA GCTTGTGGGC ATTTTGAATT ATATATTTCA 1380 

CATTTTGCAT AAAACTGCCT ATCTACCTTT GACACTCCAG CATGCTAGTG AATCATGTAT 1440 

CTTTTAGGCT GCTGTGCATT TCTCTTGGCA GTGATACCTC CCTGACATGG TTCATCATCA 1500 

GGCTGCAATG ACAGAATGTG GTGAGCAGOG TCTACTGAGA TACTAACATT TTGCACTGTC 1560 

AAAATACTTG GTGAGGAAAA GATAGCTCAG GTTATTGCTA ATGGGTTAAT GCACCAGCAA 1620 

GCAAAATATT TTATGTTTCG GGGGTTTTGA AAAATCAAAG ATAATTAACC AAGG ATCTTA 1680 

ACTGTGTTCG CATTTTTTAT CCAAGCACTT AGAAAACCTA CAATCCTAAT TTTGATGrCC 1740 

ATTGTTAAGA GGTCGTGATA GATACTATTT TTTTTTCATA TTGTATACOG GTTATTAGAA 1800 
AAGTTGGGGA TTTTCTTGAT CTTTATTGCT GCTTACCATT GAAACTTAAC (XACCTGTGT 1860

TCCCCAACTC TGTTCTGCGC AOGAAACA£ST ATCIGTTTGA GGCATAATCT TAA6S0G0CA 1920

O^CACAATGT TTTCTCTTAT GTTATCTGGC AGtAACTGTA ACTPGAATTA CATTAG CACA 1980

TTCroCTTAG CTAAAATTGT TAAAATAAAC TTTAATAAAC CCATGTAGCC CTCTCATTTG 2040

ATTGACAGTA TTTTAGTTAT TTTTGGCATT CTTAAAGCTG GGCAATGTAA TGATC AGATC 2100

TTTGTTTCTC TGAACAGGTA TTTTTATACA TGCTTTTTGT AAAOCRAAAA CTTTTAAATT 2160

TCTTCAGGTT TTCTAACATG CTTACCACTG GGCTACTGTA AA.T6AGAAAA GAATAAAA.TT 2220
ATTTAATGTT 7T



Seq ID NO: 144 Protein sequence:
Protein Accession #: NP_060601



1 11 21 31 41 51

| | | | | |

MHIQIiKDhLE KSOQWIiVYDQ QREVYViQGIiL AKIFBLEKKT ETAAHSl*PQQ TKKPESEGYL
QESKQKCVND LIiASAKKDIjB VERQTITQLS FBIiSEFiUUCy EETQKEVHML NQLLYSQRRA
DVQHLEDDRH ICTEKIQKLRE ENDIARGKLB EEKXRSEELli SQVQSLYTSL UCQQBEQTRV
ALLEQQMQAC TliDFENSKLD RQHVQHQLHV ILKELRKARK NmVGIbETA S



Seq ID NO: 145 DNA sequence 
Nucleic Acid Accession #: NM_001168
Coding sequence: 50..478



1 11 21 31 41 51 

COGCCRGATT TGAATOGOGG GACCCGTTGG CAGAGGTGGC GGCX5GCGGCA TGGGTGCCCC 60 

GAOGrreCCC CCTGCCTGCSC AGCOCTTTCT CAAGGACCAC CX3CATCTCTA CATTCAAGAA 120 

CTGGCCCrrC TTGGAGGGCT GCGCCTGCAC CCCGGAGCGG ATGGCOGAGG CTGGCTTCAT 180 

CCACTCCCCC ACTGAGAACG AGCCAGACTT GGCCCAGTGT TTCTTCTGCT TCAAGGAGCT 240 

GGAAGGCTGG GAGCCAGATG ACGACCCCAT AGAGGAACAT AAAAAGCATT OGTCCGGTTG 300 

OQcrrrccTT TCTGTCAAGA AGCAGTTTGA AGAATTAACC CTTGGTGAAT TTTTGAAACT 360 

GGACAGAGAA AGAGCCAAGA ACAAAATTGC AAACGAAACC AACAATAAGA AGAAAGAATT 420 

TGAGGAAACT GCGAAGAAAG TGCGCOGTGC CATCGAGCAG CTGGCTGCCA TGGATTGAGG 480 

CCTCTGGCX:G GAGCTGCCTG GTCCCAGAGT QGCTGCACCA CTTCCAGGGT TTATTCCXrrG 540 

GTGCCACCAG CCTTCCTGTG GGCCCCTTAG CAATGTCTTA GGAAAGGAGA TCAACATTTT 600 

CAAATTAGAT GTTTCAACTG TGCTCCTGTT TTGTCTTGAA AGTGGCACCA GAGGTGCTTC 660 

TGCCTGTGCA GCGGGTGCTG CTGGTAACAG TCGCTGCTTC TCTCTCrCTC TCTCTTTTTT 720 

GGGGGCTCAT TTTTGCTGTT TTGATTCCCG GGCTTACCAG GTGAGAAGTG AGGGAGGAAG 780 

AAGGCAGTGT CCCTTTTGCT AGAGCTGACA 6CTTTGTTCG CGTGGGCACA GCCITCCACA 840 

GTGAATGTGT CTGGACCTCA TGTTGTTGAG GCTGTCACAG TCCrGAGTGT GGACTTGGCA 900 

GGTGCCTGTT GAATCTGAGC- TGCAGGTICC TTATCTGTCA CACCTGTGCC TCCTCAGAGG 960 

ACAGrrrTTT TGTTGTTGTG TTTTTTTGTT TTTTTTTTTT QC5TAGATGCA TGAC TTGT GT 1020 

GTGATGAGAG AATGGAGACA GAGTCCCTGG CTCCTCTACT GTTTAACAAC ATOGCTTTCT 1080 

TATTTTGTTT GAATTGTTAA TTCACAGAAT AGCACAAACT ACAATTAAAA CTAAGCACAA 1140 

AGCCATTCTA AGTCATTGGG GAAACGGGGT GAACTTCAGG TGGATGAGGA GACAGAATAG 1200 

AGTGATAGGA AGCGTCTGGC AGATACTCCT TTTGCCACTG CTGTGTGATT AGACAGGCCC 1260 

AGTGAGCCGC GGGGCACATG CTGGCCGCTC CTOCCTCAGA AAAAGOCAGT GGCCTAAATC 1320 

CTTTTTAAAT GACTTGGCTC GATGCTGTGG GGGACTGGCr GGGCTGCTGC AGGCCGTCTG 1380 

TCTGTCAGCC CAACCTTCAC ATCTGTCACG TTCTCCACAC GGGGGAGAGA OGCAGTCOGC 1440 

CCAGGTCCCC GCTTTCTTTG GAGGCAGCAG CTCCCGCAGG GCTGAAGTCT GGOGTAAGAT 1500 

GATGGATTTG ATTOGCCCTC CTOCCTGTCA TAGAGCTGCA GGGTGGATTG TTACAGCTTC 1560 
GCTGGAAACC TCTGGAGOTC ATCTOGGCTG TTCCTGAGAA ATAAAAAGCC TGTCATTTC 



Seq ID NO: 146 Protein sequence:
Protein Accession #: NP_001159



1 11 21 31 41 51 

| | | | | |

MGAPTLPPAW QPPLKDHRIS TFKSWPPLEG CACTPERMAE AGPIHCPTEN EPDLAQCPPC 60 
FKELBGWEPD DDPIEEHKKH SSGCAPLSVK KQPEBLTLGB PUCLOREBAK NKIAKETNMK 120 
KXSFEETAKK VRRAIEQLAA MD 



Seq ID NO: 147 DNA sequence

Nucleic Acid Accession #: NM_014176.1

Coding sequence: 127-720

1 11 21 31 41 51

| | | | | |

GCGCCCAGOS CTGGTACCCC GTTGGTCCGC GOGTTGCTGC GTTGTGAGGG GTGTCAGCTC 60 

AGTGCATCCC AGGCAGCTCT TAGTGTGGAG CAGTGAACTG TGTGTGGTTC CTTCTACTTG 120 

GGGATCATGC AGAGAGCTTC ACGTCTGAAG AGAGAGCTGC ACATGTTAGC CACAGAGCXA 180 

CGCCCAGQCA TCACATGTTG GCAAGATAAA GACCAAATGG ATGACCTGCG A6CTCAAATA 240 

TTAGGTGGAG CCAACACACC TTATGAGAAA GGTGTTTTTA AGCTAGAAGT TATCATTCCT 300 

GAGAQGTACC CATTTGAACC TCCTCAGATC OGATTTCTCA CTCCAATTTA TCATCCAAAC 360 

ATTGATTCTG CTGGAAGGAT TTGTCTGGAT GTTCTCAAAT TGCCACCAAA AGGTGCTTGG 420 

ACACCATCCC TCAACATCGC AACTGTGTTG ACCTCTATTC AGCTGCTCAT GTCAGAACCC 480 

AACCCT6ATG ACCOGCTCAT GGCTGACATA TCCTCAGAAT TTAAATATAA TAAGCCAGCC 540 

TTCCTCAAGA ATGCCAGACA GTGGACAGAG AAGCAT6CAA GAGAGAAACA AAAGGCTGAT 600 

GAGGAAGAGA TGCTTGATAA TCTACCAGAG GCrGGTGACC 0CA6AGTACA CAACTCAACA 660 

CAGAAAAGGA AGGCCAGTCA GCXAGZAGGC ATAGAAAAGA AATTTCATOC 1GATGTTTAG 720 

GGGACrrCTC CTGGTTCATC TTAGTTAAIG TGTTCTTTGC CAAGGTGATC TAAGTTGCCT 780 

AGCTTGAATt TTTTTTTAAA TATATTTGRT GACATAATTT TTGTGTAGTT TATTTATCTT 840 

GTACATATGX ATTTTGAAAT CTTTTAAACC TGAAAAATAA ATAGTCATTT AATGTTGAAA 900 
AAAAAAAAAA AAAAAAAAAA AAMAMA

Seq ID NO: 148 Protein sequence:
Protein Accession #: NP_054895.1

1 11 21 31 41 51

| | | | | |

MQRASRLK8E I£MLATEPPP 6ITCR(^KCQ MDDUIAQILG GANTPYEKGV FKLEVIIPER 
YPFEPPQZRF LTPIYBPHID SAGSICUTVL fO^PPKCSHRP SLNIATVLTS IQLLMSEPKP 
DDPIMADISS EFKYNKPAFL KNAROWTEKB ARQKQKADEE EMZiDHLPEAG DSRVBNSTQK 
RKASQLVGZB XKFHFOV 

Seq ID NO: 149 DNA sequence
Nucleic Acid Accession #: NM_003812 
Coding sequence: 224-2722 






TCCTCTGCGT 
GCCCAGCCCC 
CCATGCG06C 
GCTCTOGCCG 
CAGCTOGOGG 
ACGCGGCCCC 
GCTTCTCGTC 
GGCTGCXGG6 
GGCAGATGAA 
AATGCAGAAA 
AAGCCCTTAT 
CCATCTGGCC 
CATACTGAAC 
ACCACAGTAC 
AGACTOCAAG 
CTTOGTCTAT 
ACATATAATC 
GGAAAGAGGT 
AGCAGTGAAT 
TAATGATCAC 
AAAGTCCGTG 
CCTGGTOGCT 
GCAGATGCTC
GCACCTCATC 
TGTCTGTTCT 
ACAAGTATTA 
AAAGCCAAAA 
GTCCCATTCT 
AGGAGGTGOA 
AAATGGATAC 
ATTATGCTGT 
TAACAATACC 
GTGTGATATT 
GCAAGACGGA 
CAGAGACAAC 
CTATGAAAAG 
GTGGATTCAG 
TCX3AGCTCCA 
AGGCCX3GGTG 
CTATGTAGAA 
ACAAATTCAA 
GGGCCATGGG 
AGATTGCAGT 
GGGTCCTAGT 
TATTGTCCTT 
TACTCAGCAA 
ATTCTGGGTA 
CTTTGGGTGG 
CTGTCTCTTT 
GGGGGCAAAA 
AGGAAGGAAC 



11 
I 

CCCGCCCOGG 
GAGCCCCGOG 
0GAGC06G0G 
GCCACACGGA 
CAGCXX3COCC 
GCCGGCTOGG 
CTTCTCCTGC 

cx:cagcgctc 
gacaatacat 
gaaatcacac 
cacgttcttg 
caggcaagct 

AATGGTTTGT 
TCTAACGGTG 
GTGGCTCTGT 
ATGATAGAGC 
CAGAAAACCT 
GACCAGTGGC 
CXy^TCACGTG 
AAAACGTATA 
GTCAACCTTG 
GTAGAGACCT 
CATGAGTTCT 
TCGCGGGTGA 
CGCACAAGAG 
TCGCAGAGCC 
TGTGACTGCA 
CGAAAATTTT 
GCCTGCCTTT 
GTGGAAGCTG 
AAGAAATGTT 
TCATGTCTTT 
ACTGAATATT 
TATGCATGCA 
CAGTGTCAGT 
CTGAATACA6 
TGCAGCAAAC 
CGTATTGGTC 
ATTGACTGCA 
GATGGAACGC 
GCCCTAAATA 
GTGTGTAGTA 
ATCCGGGATC 
GCCACCAATC 
GGGGGCACAG 
GGCCCCATCT 
TGACATACTC 
TAATGACTAC 
TGGAAATAAT 
GACCATGCTA 
AACACACACA 



21 
I 

GAGTGGCTGC 
CCCCX3TGCCC 
TGACOGGCrC 
GGGGC6CC0G 
TGG0GG6CTG 
TGCCTGCCAG 
TGCCTCCCCT 
CGCATTGGAA 
TGCAACAGAA 
TGCCTTCAAG 
ACACAAAGGC 
TCCAGATTGA 
TGTCTTCTGA 
GAGAGCACTG 
CAACCTGCAA 
CACTAGAGCT 
TGGCAGGACA 
CCTTTCTCTC 
GTATATTTGA 
AGAAGCATCG 
TGGATTCTAT 
GGACTGAGAA 
CAAAATACCG 
CATTTCACTA 
GAGTTGGTGT 
TGGCTCAAAA 
CAGAATCCTG 
CAAAGTGCAG 
TCAACAGGCC 
GGGAGGAGTG 
CCCTCTCCRA 
TTCAGCCAOS 
GTACTGGAGA 
ATCAAAATCA 
ACATCTGGGG 
AAGGCACTGA 
AT6ATGTGTT 
AACTTCAGGG 
GTGGTGCCCA 
CATGTGGCCC 
TGAGCAGCTG 
ATGAAGCCAC 
CAGTTAGGAA 
TCATAATAGG 
GCTGGGGATT 
GAATCAGCTG 
GCAGCAGTGT 
GGAGCTAAAG 
6TCAAAGAAC 
TAAAAAGAAC 
CAAAAATTAA 



31 
I 

GAGGCTAGGC 
CGAGCCCGGA 
OGOC06CXX3C 
GGAGCTArGA 
CAGOCTTGCC 
CGCCCOGGCX: 

CGcascxrrcG 

TGAAACTGCA 
TAGCAGCAGT 
ACTCATATAT 
AAGACAGCAG 
AGOCTTOGGC 
TTATGTGQAG 
TTACTACCAT 
TGGACTTCAT 
(K3TTCATGAT 
GTATTCTAAG 
TGAATTACAG 
AGAAATGAAA 
CTCTTCTCAT 
TTACAAGGAG 
GGATCAGATT 
GCAGOGCATT 
TAAGAGAAGC 
GAATGAGTAT 
CCTTGGAATC 
GGGTGGCTGC 
CATTTTGGAG 
AACAAAGCTA 
TGATTGTGGT 
CGGGGCTCAC 
AGGGTATGAA 
CTCTGGTCAG 
GGGCCGCTGC 
AACAAAGGCT 
6AAGGGAAAC 
CTGTGGATTC 
TGAGATCATT 
TGTAGTTTTA 
GTCTATGATG 
TCCACTCGAT 
CTGCATTTOT 
CCTTCACCCC 
CTCCATOGCT 
TAAAAATGTC 
CGCTGGATGG 
TACTGGAACT 
TTGGGGTGAC 
ACCTTTCACC 
TGITCCftGAA 
ATGCAATAAA 



41 

1 

GAGCCGGCAA 
GCCCCCTGCC 
OGCCCOGCAG 
GCCATGAAGC 
GGOGCTTCCT 
CGCACGCCGC 
TCCCGGCCCC 
GAAAAAAATT 
AATATCAGTT 
TACATOUICC 
CAAAAACATA 
TCCAAATTCA 
ATTCACTACG 
GGAAGCATCA 
GGCATGTTTG 
GAGAAAAGCA 
CAAATGAAGA 
TGGTTGAAAA 
TATTTGGAAC 
GCACATACCA 
CAGCTCAACA 
GACATCACCA 
AAGCAGCATG 
AGTCTGAGTT 
GGTCTTCCAA 
CAATGGGAAC 
ATCATGGAGG 
TATAGAGACT 
TTTGAGCCCA 
TTTCATGTGG 
TGCAGGGACG 
TGCCGGGATG 
TGCCCACCAA 
TACAATGGCG 
GCAGGGTCTG 
T6GGGGAAGG 
TTACTCTGTA 
CCAACTTCCT 
GATGATGATA 
TGTTTAGATC 
TCCAAGGGTA 
GATTTGACCT 
OCCftAGGATG 
GGTGCCATCC 
AAGAAGAGAA 
ACACCGCCTT 
ATTAAGTTTG 
AAGGATGGGG 
ACCTGTCAGT 
TCrrTTTTTT 
GGAATCATTA 



Seq ID NO: 150 Protein sequence:
Protein Accession #: NP_003803



I 

MKPPGSSSRQ 
RPRAMGAAAP 
INQDSESPYH 
HYENGKPQYS 
KSTGRPHZZQ 
LELMIVNDHIC 
ITTNPVQMLH 
LPKAVAQVLS 
RDFLQRGGGA 
SDGPCOQITS 
MGECKTBU3NQ 



11 
1 

PPLAGCSLAG 
SAPHWNETAE 
VLDTKARHQQ 
KGGEHCYYHG 
XTLAGQYSKQ 
TYKKHRSSEA 
EFSKYRQRIK 
QSLAQNLGIQ 
CLFNRPTKLF 
CLFQPRGYEC 
OQYIWGTKAA 



21 

I 

ASCGPQRGPA 
KNJjGVLADED 
KHNKAVHLAQ 
SZRGVKDSXV 
MKNLTMERGD 
HTNNFAKSW 
QHADAVHLIS 
WEPSSRKPKC 
BPTECQIGYV 
SDAVNECDZT 
GSDKFCYEKL 



31 

I 

GSVPASAPAR 
NTliQQNSSSN 
ASFQIEAFGS 
ALSTCNGLHG 
QHPPZiSELQW 
NLVDSIYKEQ 
RVTPHYKRSS 
DCTESWGGCI 
EAGEECDCX3F 
SYCTGDSGQC 
NTEGTEKGNC 



41 

I 

TPPC31LLLVL 
ISYSNAMQKE 
KFILDLILNN 
MFEDDTFVYM 
ZiKRRXRAVNP 
IiNTRWIiVAV 
LSYFGGVCSR 
MEETGVSHSR 
HVECYGLCCK 
PPNLHXQDGY 
GKDGOSWXQC 



60 
120 
180 



SI 
1 

AGGGGGCGCC 
CGCGGCGGCA 
CTAGCCCGGC 
CGCCOGGCAG 
GCGGCCCCCA 
CCTGCCGOCT 
GCGCCTGGGG 
TGGGAGTCCT 
ACAGCAATGC 
AAGACTOGGA 
ATAAGGCTGT 
TTCTTGAOCT 
AAAATGGGAA 
GAGGCGTCAA 
AAGATGATAC 
CAGGTCGACC 
ATCTCACTAT 
GAAGGAAGAG 
TTATGATTGT 
ACAACTTTGC 
CCAGGGTTGT 
CCAACCCTGT 
CTGATGCTGT 
ACTTTGGAGG 
TGGCAGTGGC 
CTTCTAGCAG 
AAACAGGGGT 
TTTTACAGAG 
CGGAATGTGG 
AATGCTATGG 
GGCCCTGCTG 
CTGTGAAOGA 
ATCTTCATAA 
AGTGCAAGAC 
ACAAGTTCTG 
A7GGAGACCG 
CCAATCTTAC 
TCTACCATCA 
CGGATGTGGG 
GGAAGTGCCT 

aagtctgttc 
gggcagggac 
aaggacccaa 
tggtagcagc 

GGTTCGATCC 
GCACTGTTGG 
TAAACAAAAC 
TAAAAGAAAA 
AAACGGGGGA 
TCCCTAATGG 
AAAA 



51 

1 

LLLPPLAASS 
ITLPSRLIYY 
GLLSSDYVEI 
lEPLELVHDE 
SRGZFEQOCY 
ETWTEKDQZD 
TRGVGVNEYG 
KFSKCSILEY 
RCSLSNGAHC 
AOJONQGRCY 
SKHDVFOGFL 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580
2640 
2700 
2760 
2820 
2880 
2940 
3000 



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 

IiCTKLTRAPa XGQLQGEIIP TSPVHQGaVI DCSGAHWLD DDTDVGYVED GTPCG?SKMC 
MRKCLQIQA IMISSCPU>S KGKVCSGHGV CSUEATCICD FTWftGTDCSI HDPVHNIBPP 
KDE6PXGPSA 1NLIIGSIA6 AILVAAIVLG GT6UGF1QIVK KRSFDPTQQG PI 

Seq ID NO: 151 DNA sequence
Nucleic Acid Accession #: NM_023915
Coding sequence: 250-1326 



720 
780 






1 
1 

GGCAOGAGGG 
TCAAAGCTTA 
GTGAATGGAC 
CCCAOGCCTC 
AACTGAAGAA 
CAAGAGAGTC 
AATGAATTTG 
TTGCTGAATG 
TTCTATCTCA 
ATAGTCO^TG 
TCAGTTTTGT 
GATCGCTATC 
AOGAAGGTTT 

atcctgacaa 
cctttggggg 
gtgctggtga 
aggcaattca 

GTGGCTGTGT 
AGTCACTTAG 
ATTACACTTT 
TGTAOGTCAT 
ATCAGATCAC 
GTGTAGGCCT 
TTCATTATCC 



11 
I 

TTCTTAATTA 
AGCCAGCCAC 
AA'rCGTCCCC 
TGGGGTTCAA 
ACAATTCAGG 
ACACAATTGT 
GTTTAlGCAGT 
AAAACATAGT 
ATGCAGGATT 
TTTATGCAAA 
TGAAGGTGGT 
TATCTGTTTG 
ATGGTCAGCC 
TCAAATGGCA 
TTCTGATCGG 
TAAGTCAGTC 
TTTTTACCTG 
ACAGGCTTTT 
TCTTGTCTGC 
TTTCAAGAAG 
TGCAAAGTGT 
TTTATTGTTT 
TTAAAAAAAA 



21 

I 

ATGCTTTACC 
GAGACAAGAA 
CACAAIGAAA 
AAGTGTTTCC 
CTTGACGCTT 
CAACAGGAGC 
CTTGCOGGTG 
CrrGGATCTTC 
GGTTGCAGAC 
TGGACCTTGG 
CATGTATACT 
CAAGCCATTT 
TGTTTGGGTG 
AACAGAGGAC 
TACGGCAGTC 
ATGTTACATA 
AAGCGGAAAG 
CTTTCTACCA 
AGATGAATCT 
GTGTAATGTT 
GCTGTTCAAA 
GAGAAGATCG 
GTTGGAATGG 
AA 



31 
I 

AGAAAATCCA 
ACCTGTTTCA 
GAAATCAAAC 
TGACAO^CAT 
GCAAAATTAC 
GAGGGGCCAG 
CTTTATCTCA 
TTCCACATTA 
CTCATAATGA 
TACTTCAAGT 
TOCATOGTGT 
G6GGACTCTC 
ATCATGGCTG 
AATATCXZATG 
ACCTATGTGA 
GCCATATCCA 
OGAAAACATA 
TATCACTTGT 
GCACAAAAAA 
TGCCTGGATC 
AAATCAAATA 
GAAGTTOSCA 
ATATGTACAA 



41 
I 

CTTCCCTGCC 
ACTTGAAGAC 
CAGGAATAAC 
CTTTGCTTAC 
CAAATAAOGA 
GAAAGAACAC 
TTATATTTGT 
GGAATAAAAC 
GGCTGACATT 
TTATTCTCTG 
TCCTTGGGCT 
GGATGTACAG 
TTTTGTCTTT 
ACTGCTCAAA 
ACAGCTGCTT 
GGTACATCCA 
ACCAGAGCAT 
GCAGAATTCC 
TCCTATATTA 
CAATAATTTA 
TCAGAACCAG 
TAIATTATGA 
AGTGTAAATA 



51 
I 

GACCTTAGTT 
ACCXTTATGAG 
CTATGCTGAA 
AGTGCATCAC 
GC TGCAOGG C 
CACCCTTCAC 
GGCAAGCATC 
CAGCTTCATA 
TCCATTTCGA 
CAGATACACT 
GATAAGCATT 
CATAACCTTC 
GCCAAACATC 
ACTTAAAAGT 
GTTTGTGGCC 
CAAATCCAGC 
CAGGGTTGTT 
TTTTACTTTT 
CFGCAAAGAA 
CTTTTTCATG 
GAGTGAAAGC 
TTACACTGAT 
AATGTTTCTT 



Seq ID NO: 152 Protein sequence:
Protein Accession #: NP_076404



MGFNLTLAKL 
GLAVWIFFHI 
FYANWrrSIV 
NGQPTEDHIH 
ZSQSSRKRKH 
FLSACNVCLD 



11 
I 

PNNELHGQES 
RNKTSFIPVI. 
FLGLISIDRY 
DCSKLKSPIXS 
NQSIRVWAV 
PIIYFFMCRS 



21 

I 

KNSGNRSDGP 
KNIWADLIM 
LKWKPPGDS 
VKWHTAVTYV 
PFTCPUPYHL 
FSRRLFKKSN 



31 
I 

GKNTTLHNEF 
TLTPPPRIVH 
RMYSITFTKV 
NSCLFVAVLV 
CRIPFTPSHI* 
IRTRSESIRS 



41 

} 

DrXVLPVLYL 
DAGFGPWYPK 
LSVCVWVIMA 
ILIGCYIAIS 
DRLLDESAQK 
LQSVRRSEVR 



51 

i 

ZXFVASILLN 
PILCRYTSVL 
VliSLPNIILT 
RYIHKSSRQF 
ILyyCKEITL 
lYYDVTDV 



Seq ID NO: 153 DNA sequence
Nucleic Acid Accession #: D80008.1
Coding sequence: 149-739 



1 

1 

GTTCGGCGCC 
CGAAAGGAGT 
AAGGCCGCGG 
OGAGCTGCAT 
AGTTCTGGAG 
GTCAGGTGGA 
AAATCGACX3C 
ATGGGAATAT 
GGAGTGGTTT 
TGAAGGTTTG 
GTGTCTAAAA 
AAATAGCCAG 
GGAGCACATC 
CTCCTCTGTA 
TAGACATTGT 
AGGACTTTCT 
GTTTTGTAGA 
AGTGCTCCCA 
CCCCTACTCC 
GTGTGTTTTT 
TTQGCTGGAC 
CAAGCTAGAG 
TGGTCTGTAG 
TATTTGGGAA 
CTTGTGGCTA 
CTAGAGAAGG 
AGAGTTGATT 
TCCAGTTTAT 
TCCCAAGATC 
TACTTTGGTC 
GTTTTGAGAT 
CACTGCAATC 
GGGATTACAG 



11 

I 

AAAGCGCGGA 
GAGGCGCCGA 
GAGTGGGAAG 
CGCGCGCCCG 
GAGATGAAAG 
OGAAGTGATT 
TGCACTGTAG 
GGTAGCGTCT 
AATAATTATA 
GACATTACAC 
GACTATGGAG 
CACTTTTTAC 
CTGTCATGAC 
CTCACTCTCT 
TTAAGATAAC 
TTTTTTAATG 
GACTGTCTCA 
CCTTAGCTTC 
TTTTTCTAAT 
TAAATGAAAG 
AGGAAGAAGG 
AGCTGAATTT 
AAATTTTCAG 
GGAAGGACAC 
TGGGGTGATC 
AACTTTGTAC 
GTCTTTTAAT 
TQGTTTGTTC 
ACAATTTTTT 

tatgacccgt 
ggagtcttgt 
tctatcccct 
gcacaggc06 



21 

1 

GCGGAGGCCG 
GAGCCCAGAT 
CGTCCGCCAT 
AAGGGCAACr 
CTTTGTAT6A 
TGATACCAAC 
CATACCTGTA 
TGCCAAATGC 
AAAGATCTCT 
AGGATATGAA 
AATTTGAAGT 
CTCGATGGAA 
CATGCGCC3GA 
CCACCACTCC 
TAAGAATACT 
TTGTACACTA 
CTATGTTGCC 
TCAAAGTGTT 
AAGCTGTATC 
TAAACATGGT 
TAGATCCTGT 
CTGAGATACA 
TATATATAAT 
ACATGGATTT 
ACCA6TATCA 
AGTTTTCCCT 
GGTATCTTTT 
TTTTATGCTT 
TTCCTTTTTA 
TTTTTTTTTT 
TCTOTCACCC 
OGGTTCAAGT 
CCACGCCTGG 



31 

1 

AGGCGAGAGC 
ACCATTTTGG 
GTTCTGOGAA 
GCCTGCCTTC 
ACAAAACCAG 
TATOUUITTT 
TGACCGCTTG 
ATTACXSATTT 
TGCTACTTAT 
ACCACCAAAA 
TGATGATGGC 
ATGTGAGCAG 
GGCACTTCCA 
CTTCACCTCC 
TGGCTAAGAA 
TTCTTCCTAC 
CAAGCTGGTC 
GAGATCACAG 
TGTAATCACA 
TACATTTGAA 
GTGTCTTGTT 
CATTTTCAAA 
GTTTAATGAC 
TGCACATTTC 
CCACTTTGGA 
GAGATTCAGA 
AAACAGCTGA 
TGGGTGTTGC 
CTTCTAGAAG 
GTTTTGTTTT 
AGGCTGGGGT 
6ATTCTCIT6 
CTAATTTTTG 



41 
I 

CTGGCGCTGT 
CGTGAGAGCr 
AAAGCCATGG 
AACX3AGGATG 
TCTGATGTGA 
CGACACTGTT 
CTTCGGATCA 
CACATGGCTG 
ATGAGGTCAC 
AGCCTATATA 

acttcagtcc 
ctgatcagac 
ggcttcactc 

CTCTTTGATT 
GTATAATTTG 
TCTTTTTTGG 
TCAAACTCCT 
GCGTGAGCCA 
GCATTCCTAC 
TCTCTTAAAT 
TTCTGGTCAT 
TCACATGCAA 
ATACTAATTT 
CACCATGGTG 
AGGOGACAGT 
TTGACTGAAA 
CATTTTAAAT 
ATO06AGAAA 
TGTTATAATT 

G'r m Ti'OGr 

GCAGTGGCGT 
TCTCAGCCTC 
TATTTTIAlGT 



60 

120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380



60 
120 
180 
240 
300 



51 
1 

AGGACTAGAA 
GGTGGTTGGC 
AACTGATCCG 
GACTCAGACA 
ATGAAGCAAA 
CTCTGTTAAG 
GA6CACTCAG 
CTGAAGAAAT 
TGGGAGGAGA 
TTGAAGTCOG 
TATTAAAAAA 
AAGGAGTCCT 
AACTCATGGA 
TTA6AAGCTA 
CTAACTATTA 
TTTTGGTTTT 
GGCCTCAAGC 
CTGCACCCGG 
AGTTGTTACA 
AAGCAGTCAC 
GTGTATTGTA 
GTGAAGATGA 
ATCATCTGGC 
GCTGGTGTGG 
GAAATTGGGG 
AGTCACATGA 
TTTGATGAAA 
TCTTTTCCCA 
TTAAGCTTTA 
TTGTTTCTTT 
GATCTTGGCT 
CCAAGTAGCT 
AGAGACAGAG 



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 

TTTTACCATG TTGGCCAGGC 
CCCAAAGTTT TGGGRTTACA 
GAATTTTTTA TATGGTGCAA 
TCCAGCTGTT 7CACTACCAT 
TTTGTTAAAA AGTAGTTGTC 
ATTGACCTGT TTTTCTCTCC 
TCTAATAATT CTTGAAACAG 
TTTGTAGAGA TGGGGTTTCA 
ATACACTTGC CTCGT CCTCC 
CAGTGTACCA GATTTCTTTT 
GTGAAATTTG GGAACAGGCA 
GGCCTAGATG GGTGGATCAC 
AACTOCGTCT CTACAAAAAA 
CACAGTTACA CGGCAfiGCTG 
GTGAGCTGAG ATCACACCAC 
AAAGAAATTA GGATCAATTT 
CACCTTGATT GAGATTGCAT 
AATATTGAGT CTTCTGGOCT 
TCTATTTCTC TTAATAATCT 
CATACfTTTTG ATGCTAAATG 
AATAGAAATA CAATTGATGT 
ATGGTGTTTT TGTAAATTAC 
TTC 



Seq ID NO: 154 Protein sequence: 
Protein Accession #: BAA11503.1



TGGTTTCftAA 
AGTGTGGGOC 
GCXGTCAATC 
TTTTTGAAAS 
AATGTATATG 
TGAATGCCAA 
ATAGTATTAA 
CCGTGTTGGC 
CXAT6TGCTG 
TGAGATTTGT 
GGGTGTGGTG 
TTGAGCTCAG 
TAGAAAAAAT 
AGGTtGGGAGG 
TGTACTOCAG 
GTCAATTTCT 
TGAATTTATA 
ATAAACAAGG 
TTTGTAGTTT 
GTATTTTAAA 
TGAACTTCTA 
ATCAACAGTC 



CTCCTGAOCT 
ACdSCGGCCA 
CACCTTCACT 
GACTGCCCTT 
TGGGTTTATT 
TACCATATTT 
TGTGTCATAT 
CAGGCTGTGT 
GGATEACAGG 
TTTGGCTATG 
GCTTATGCCT 
GAGTTCCAGA 
TAGCCAGGTG 
ATCACTTGAA 
CCTGGGTGAC 
ACAACAACAA 
TAAAACTGTT 
TCTGTCTTCC 
TCAGTGTACA 
ATTTCAAATT 
TCCT TCAGOC 
ATGTGTTCTA 



CAAGTGACCC 
GCCTATGATC 
TTTTCTTGGG 
TGCTCTATCA 
TCACGACTCT 
GTATGTAGTG 
TTTTGCTGTT 
TGAACTCCTO 
G6TGAGCCTT 
TTAAlGTOCTT 
GTAATCCTAG 
CCAGCCCGGG 
TGGTGGTGCA 
CCCXAGAGGT 
AAAGTGAGAC 
CAACAAAAAC 
G0GA6AATTG 
TAGGTATTAA 
GGTCTACCAT 
CTAACCACTT 
TTGCTAAACT 
TGAATAAAGA 



ACCTTGGCCT 
CATTTTGAAT 
AATATAGMCA 
OCTTTGCATT 
G'rrTTG'I TCC 
TATGTAATTT 
GTTTGTATTT 
AGCTAAAGCA 
GGTGCT6GCC 
TGCTTTTGAT 
AACTTTGGGA 
CCTATGGCAA 
TGCCTGTAGT 
CRAGACTGCA 
TCTATCTCAA 
OCCTGTTGGG 
ACaXCTTAAT 
TGTTTTGTCT 
GTCAGCATTT 
GTTGCTAGTA 
GTGftfiTTCTC 
GTTTTACTCC 



2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 






11 



21 31 41 51 

MFCEKAMELI RilJiRAPEGQ LPAPNEDGLR QVLEEMFALY BQNQSDVNEA KSGGRSDI.IP 
TIKFRHCSLL RHRRCTVAYL YDRIilAIRAL RVIEYGSVLPM ALRPKMAAEB KESfFlWYKRS 
LATYMRSLGG DBQLDITODM KPPKSLYIEV RCLKDYGEPE VCOGftSVULK KNSC2HPLPRW 
KCEQLIRQGV IiEHILS 

Seq ID NO: 155 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 149-709



1 

1 

GTTCGGCGCC 
CGAAAGGAGT 
AAGGCCGCGG 
CGACSCroCAT 
AGTTCTGGAG 
GTCAGGTG6A 
AAATCGAOSC 
ATGGGAATAT 
GGAGTGGTTT 
TGAAGGTTTG 
ATGCAGTGGC 
CAACCTCCAC 
GCACTTCftGT 
AGCTGATCAG 
CAGGCTTCAC 
CCCTCTTTGA 
AAGTATAATT 
ACTCTTTTTT 
TCTCAAACTC 
AGGOSTGAGC 
CAGCATTCCT 
AATCTCTTAA 
TTTTCTGGTC 
AATCACATGC 
ACATACTAAT 
TCCACCATGG 
GAAGGGGACA 
GATTGACTGA 
GACATTTTAA 
GCAXOG6A6A 
AGTGTTATAA 
TTGTTTTTTC 
GTGCAGTGGC 
TGTCTCAGCC 
TGTATTTTTA 
CTCAAGTGAC 
CAGCCTATGA 
CTTTTTCTTG 
TTTGCTCTAT 
TTTCAGGACT 
TTGTATGTAG 
ATTTTTGCTG 
GTTGAACTCC 
GGGQTGAGCC 
TGTTAAGTCC 
CTGTAATCCT 
GACCAGCCCG 



11 
i 

AAAGCGCGGA 
GAGGOGCCGA 
GAGTGGGAAG 
OGCGCGCCCG 
GAGATGAAA6 
OGAAGTGATT 
TGCACTGTAG 
GGTAGGGTCT 
AATAATTATA 
GACATTACAC 
GCGATCTCGG 
CTCCCAGGTC 
CCTATTAAAA 
ACAAGGAGTC 
TCAACTCATG 
TTTTAGAAGC 
TGCTAACTAT 
GGTTTTGGTT 
CTGGCCTCAA 
CACTGCACCC 
ACAGTTGTTA 
ATAAGCACTC 
ATGTGTATTG 
AAGTGAAGAT 
TTATCATCTG 
TGGCTGGTGT 
GTGAAATTGG 
AAAGTCACAT 
ATTTTOATGA 
AATCTTTTCC 
TTTTAAGCTT 
GTTTGTTTCT 
GTGATCTTGG 
TCCCAAGTAG 
GTAGAGACAG 
CCAC CTTGG C 
TCCATTTTGA 
GGAATATAGA 
CACCTTTGCA 
CTGTTTTGTT 
TGTATGTAAT 
TTGTTTGTAT 
TGAGCTAAAG 
TTGG TGCTG G 
TTTGCTTTTG 
AGAACTTTGG 
GGCXTATGGC 



21 
I 

GCGGAGGCCG 
GAGCCCAGAT 
OGTCXGCCAT 
AAGGGCAACT 
CTTTGTATGA 
TGATACCAAC 
CATACCTGTA 
TGCCAAATGC 
AAAGATCTCT 
AGGATATGAA 
CTCAACCTGC 
CGGTGTCTAA 
AAAAATAGCC 
CTGGAGCACA 
GACTCXrrCTG 
TATAGACATT 
TAAGGACTTT 
TTGTTTTGTA 
GCAGTCCTCC 
GGCCCCTACT 
CAGTGTGTTT 
ACTTGGCTCG 
TACAAGCTAG 
GATGGTCTGT 
GCTATTTGGG 
GGCTTGTG6C 
GGCTAGAGAA 
GAAGAGTTGA 
AATCCAGTTT 
CATCCCAA6A 
TATACTTTGG 
TTGTTTTGAO 
CTGACTGCAA 
CTGGGATTAC 
AGTTTTACCA 
CTCCCAAAGT 
ATGAATTTTT 
TATCCAGCTG 
TTTTTGTTAA 
CCATTGACCT 
TTTCTAATAA 
TTTTTGTMA 
CAATACACTT 
CCCAGTGTAC 
ATGTGAAATT 
GAGGC CTAGA 
AAAACTOC39T 



31 

I 

AGGCGAGAGC 
ACCATTTTGG 
GTTCTGCGAA 
GCCTGCCTTC 
ACAAAACCAG 
TATCAAATTT 
TGAOCGCTTG 
ATTAOGATTT 
TGCTACTTAT 
ACCACCAAAA 
AACCTCCACC 
AAGACTATGG 
AGCACTTTTT 
TCCTGTCATG 
TACTCACTCT 
GTTTAAGATA 
CTTTTTTTAA 
GAGACTGTCT 
CACCTTAGCT 
CCTTTTTCTA 
TTTAAATGAA 
ACAGGAAGAA 
AGAGCTGAAT 
AGAAATTTTC 
AAGGAAGGAC 
TATGGGGTGA 
GGAACTTTGT 
TTGTCTTTTA 
ATTCGTTTGT 
TCACAATTTT 
TCTATGACCC 
ATGGAGTCTT 
TCTCTATGCC 
AGGCAGVGGC 
TGTTGGCCAG 
TTTGGGATTA 
TATATGGTGC 
TTTCACTACC 
AAAGTAGTTG 
G'rTTTTCTCT 
TTCTTGAAAC 
GATGGGGTTT 
GCCTOGTCCT 
CACATTTCTT 
TG06AACAGG 
TGGGTGGATC 
CTCTACAAAA 



41 
i 

CTGGCGCTGT 
G6T6A6ASCT 
AAAGCCATGG 
AAOGAGGATG 
TCTGATGTGA 
OGACACTGTT 
CTTOGGATCA 
CA.CATGGCT6 
ATGAGGTCAC 
AGCCTATATA 
TCCCAGGTTC 
AGAATTTGAA 
ACCTCGATGG 
ACCATGCGCC 
CTCCACCACT 
ACTAW3AATA 
TGTTGTACAC 
CACTATGTTG 
TCTCAAAGTG 
ATAAGCTGTA 
A6TAAACATC 
GGTAGATCCT 
TTCTGAGATA 
AGTATATATA 
ACACATGGAT 
TCACCAGTAT 
ACAGTTTTCC 
ATGGTATGTT 
TCTTTTATGC 
TTTTCCTTTT 
GTTTTTTTTT 
GTTCTGTCAC 
CTOGGTTCAA 
OGCCACX^CCT 
GCTGGTTTCA 
CAAGTGTGGG 
AAGGTGTCAA 
ATTTTTTGAA 
TCAATGTATA 
CCTGAATGCC 
AGKIAGTATt 
CACOGTGTTG 
CCCCATGTGC 
TTTGAGATTT 
CAGGGTGT6G 
ACTTGMQCTC 
AATA6AAAAA 



51 
I 

AG6ACTAGAA 
GGTGGTTGGC 
AACTGATC06 
GACTCAGACA 
ATGAAGCAAA 
CTCTGTTAAG 
GAGCACTCAG 
CTGAAGAAAT 
TGGGAGGAGA 
TTGAAGCTGG 
ACCTCAACTG 
GTTGATGATG 
AAATGTGAGC 
GAGGCACTTC 
CCCTTCACCT 
CTTGGCrAAfi 
TATTCTTCCT 
CCCAAGCTGG 
TTGACATCAC 
TCTGTAATCA 
GTTACATTTG 
GTGTGTCTTG 
CACATTTTCA 
ATGTTTAATG 
TTTGCACATT 
CACCACTTTG 
CPGAGATTCA 
TTAAACAGCr 
TTTGGGTGTT 
TACTTCTAGA 
TTGTTTTGTT 
CCAGGCTGGG 
GTCATTCTCr 
GGCTAATTTT 
AACTCCTGAC 
CCACCGCGGC 
TCCACCTTCA 
AGGACTGCCC 
TGTGGGTTTA 
AATACCATAT 
AATGTGTCAT 
GCCAGGCTGT 
TGGGATTACA 
GTTTTGGCTA 
TGGCTTATGC 
AGGAGTrrak 
ATTAGOCAG6 



60 
120 
180 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 






rGSQGTO^cSx^^f^ GTCACAGTTA CAOGGCAGGC TGiVGGTCGGA GGRTOCTO 2880 

AAOCtXaUSAG GTCAAGACTG CAGTGAGCIG AGATCACACC ACTGTACTCC AfiCCTOGGTG 2940 

ACAAAGTGAG ACTCTATCIt: AAAAAGAAAT TAGGATCAAT TTGTCRATTT CTACAACAAC 3000 

AACAACAAAA ACCCCTGTTG GGCACCTPGA TTGAGATTGC ATTGAATTTA TATAAAACTG 3060 

TTCGGACAAT TCACATCTTA ATAATATTGA GTCTTCTGGC CTATAA ACAA GGTCTGTCTT 3120 

CCTAGGTATT AATGTTTTGT CTTCTATTTC TCTTAATAAT CrTTT GTAGT TTTCAGTGTA 3180 

CAGGTCTACC ATCTCAGCAT TTCATAGTTT TGATGCTAAA TGGTATTTTA AAATTTCAAA 3240 

TTCTAACCAC TTOTTCCTAG TAAATAGAAA TACAATTGAT GTTSAACTO TA TCCTTCAO 3300 

CCTTGCTAAA CrGTOAGTTC TCATGGTGTT TTTGTAAATT ACATCAACAG TCAtGTGTTC 3360 
TATGAATAAA GACTTTTACT CCTTC 

Seq ID NO: 156 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51 

itPCEKAMELI REliHHAPEGQ LPAFNBDGLR QVLEEMKALlf EQMQSOVNEA KSGGRSDLIP 60

TIKPRHCSLL RNRRCTVAYl. YDRLLRIHAL RHEY6SVLPN ALRFHKAAEB MEMFNNYKRS 120 

LATVMRSLC3G DEGLDITXJDM KPPKSLYIEA GCSGAISAQP ATSTSOVHLII CKimPGPVS 180 
RRLWRI 

Seq ID NO: 157 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-621 

1 11 21 31 41 51 

TTCGGOGOCA AAGOGOGGAG CGGAGGCCC5A GGCGAGAGCC TGGCGCTGTA GGACTAGAAC' 60 

GAAAGGAGTG AGGCG(XX3AG ACCCCACATA CCATTTTGGC GXGAGAGCTG GTQCSTTCGCA 120 

AGGCCGCGGG AGTGGGAAGC GTCCGCCATG TTCTGCGAAA AAGCCATGGA ACTGATCCGC 180 

GAGCTGCATC GCGCGCCCGA AGGGCAACTG CCTGCCTTCA ACGAGGATGG ACTCAGACAA 240 

GTTCTGGAGG AGATGAAAGC TTrGTATGAA CAAAACCAGT CTGATGTGAA TGAAGCAAAG 300

TCAGGTGGAC GAAGTCATTT GATACCAACT ATCAAATTTC GACACTGTTC TCTGTTAAGA 360 

AATCGAOGCT GCACTGTAGC ATACCTGTAT GACOGCTTGC TTCX5GATCAG AGCACTCAGA 420 

TGGGAATATG GTAGOGTCTT GOCAAATGCA TTACGATTTC ACATGGCTGC TGAAGAAGTC 480 

CGGTGTCTAA AAGACTATGG AGAATTTGAA GTTGATGATG GCACTTCAGT OCTATTAAAA 540 

AAAAATAGCC AGCACTTTTT ACCTCGATGG AAATGTGAGC AGCTGATCAG ACAAGGAGTC 600 

CTGGAGCACA TCCTGTCATC ACCATGGGCC GAGGCACTTC CAGGCTTCAC TCAACTCATG 660 

GACTOCTCro TACrCACTCT CTCCACCACT CCCTTCACCT CCCTCTTTGA TTTTAGAAGC 720 

TATAGACATT GTTTAAGATA ACPAAGAATA CTTGGCTAAG AAGTA TAATT TGCTAACTAT 780 

TAAGGACTTT CTTTTTTTAA TGTTGTACAC TATTCTTCCT ACTCTTTTTr GGTTTTGGTT 840 

TTGTTTTGTA GAGACTGTCT CACTATGTTG CCCAAGCTGG TCTCAAACTC CTGGOCTCAA 900 

GCAGTCCTCC CACCTTAGCT TCTCAAAGTG TTGAGATCAC AGGCGT6AGC CACTGCACCC 960 

GGCCCCTACT CCTTTTTCTA ATAAGCTGTA TCTGTAATCA CAGCATTCCT ACAGTTGTTA 1020 

CAGTGTGTTT TTTAAATGAA AOTAAACATG GTTACATTTG AATCTCTTAA ATAAGCAGTC 1080 

ACTTGGCtGG ACAGGAAGAA GGTAGATCCT GTGTCTCTTG TTTTCTGGTC ATGTGTATTG 1140 

TACAAGCTAG AGAGCTGAAT TTCPGAGATA CACATTTTCA AATCACATGC AAGTGAAGAT 1200 

GATCGTCTGT AGAAATTTTC AGTATATATA ATGTTTAATG ACATACTAAT TTATCATCTG 1260 

GCTATTTGGG AAGGAAGGAC ACACATGGAT TTTGCACATT TCCACCATGG TGGCTGGTGT 1320 

GGCTTGTGGC TATGGGGTGA TCACCAGTAT CACCACTTTG GAAGGGGACA GTCAAATTGG 1380 

GGCTAGAGAA GGAACTTTGT ACAGTTTTCC CTGAGATTCA GATT GACT GA AAAGTCACAT 1440 

GAACA6TTX3A TTGTCrTTTA ATGGTATGTT TTAAACAGCT GACATTTTAA ATT TTGATGA 1500 

AATCCAGTTT ATTCGTTTGT TCTTTTATGC TTTGGGTGTT GCATCCGAGA AATCTTTTCC 1560 

CATCCCAAGA TCACAATTTT TTTTCCTTTT TACTTCTAGA AGTGTTATAA TTTTAAGC TT 1620 

TATACTTTGG TCTATGACCC GT ' m " m"iT TTCTTTTGrr TTGTTrTTTC GTTTCTTTCT 1680 

TTGTTTTGAG ATGGAGTCTT GTTCTGTCAC CCAGGCTGGG GTGCAGTGGC GTGATCTTGG 1740 

CTCACTGCAA TCTCTATCCC CTGGGTTCAA GTGATTCTCT TGTCTCAGCC TCCCAAGTAG 1800 

CTGGGATTAC ACGCACAGGC CGCCACGCCT GGCTAATTTT TGTATTTTTA GTAGAGACAG 1860 

AGTTTTACCA TGTTGGCCAG GCTGGTTTCA AACTCCTGAC CTCAAGTGAC CCACCTTGGC 1920 

CTCCCAAAGT TTTGGGATTA CAAGTGTGGG CCACOGCGGC CAGCCTATGA TCCATTTTGA 1980 

ATGAATTTTT TATATCGTGC AAGGTGTCAA TCCACCTTCA CTTTTTCTTG GGAATATAGA 2040 

TATCCAGCTG TTTCACTACC ATTTTTTGAA AGGACTGCCC TTTGCTCTAT CA CCTTTG CA 2100 

TTTTTGTTAA AAAGTAGTTG TCAATGTATA TGTGGGTTTA TTTCAGGACT CTCTTTTCTT 2160 

CXaVTTCACCT GTTTTTCTCT CCTGAATGCC AATACCATAT TTGTATGTAG TGTATGTAAT 2220 

TTTCTAATAA TTCTTGAAAC AGATAGTATT AATGTGTCAT ATTTTTGCTG TTGTTTGTAT 2280 

TTTTTGTACA GATGGGGTTT CACOGTGTTG GCCAGGCTGT GTTGAACTCC TGAGCTAAAG 2340 

CAATACACTT GCCTCGTCCT CCCCATGTGC TGGGATTACA GG06TGRGCC TTGGTGCTGG 2400 

CCCAGTOTAC CACATTTCTT TTTGAGATTT GTTTTGGCTA TGTTAflGTCC riTGCTTTTG 2460 

ATGTGAAATT TGGGAACAGG CAGGGTGTGG TGGCTTATGC CTGTAATCCT AGAACTTTGG 2520 

GAGGCCTAGA TGGGTCGATC ACTTGAGCTC AGGAGTTCCA GACCAGCCCG GGCCTATGGC 2580 

AAAACTCCGT CTCTACAAAA AATAGAAAAA ATTAGCCAGG TGTGGTGGTG CATGCCTGTA 2640 

GTCACAGTTA CACGGCAGGC TGAGGTGGGA GGATCACTTG AACCCCAGAG GTCAAGACTG 2700 

CACTGAGCTG AflATCACACC ACTGTACTCC AGCCTGGGTG ACAAAGTGAG ACTCTATCTC 2760 

AAAAAGAAAT TAGGATCAAT TTGTCAATTT CTACAACAAC AACAACAAAA ACCCCTGTTG 2820 

GGCACCTTGA TTCAGATTGC ATTGAATTTA TATAAAACTG TTGGGAGAAT T GACATCT TA 2880 

ATAATATTGA GTCTTCTGGC CTATAAACAA GGTCTGTCTT CCTAGGTATT AATGTTTTGT 2940 

CrTCTATTTC TCTTAATAAT CrTTTGTAGT TTTCAGTGTA CAGGTCTACC ATGTCAGCAT 3000 

TTCATAGTTT TGATGCTAAA TGGTATTTTA AAATTTCAAA TTCTAACCAC TTGTTGCTAG 3060 

TAAATAGAAA TACAATTGAT GTTGAACTTG TATCCTTCAG CCTTGCTAAA CTGTGAGTTC 3120 

TCATCGTGTT TTTGTAAATT ACATCAACAG TCATGTGTTC TATGAATAAA GAGTTTTACT 3180 
CCTTC 

Seq ID NO: 158 Protein sequence: 
Protein Accession #: Eos sequence



1 11 21 31 41 51

| | | | | |




KFCEKAMSLI BE££RAFEGQ LPAFHEDGLR QVIfEOfKALY EG^IQSDVKEA K9GG R S DL IP 60 
TIKFRHCSLL HN2HCTVAYL YDRLLRIBAL RWEYGSVLPH ALRPHMAAEB VHCLKDYGEF 120 
EVDDGTSVLL KKHSQHFLPR NKCBQLXHQG VLEEILS 



Seq ID NO: 159 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 149-229 

1 11 21 31 41 51 

| | | | | |

GTTOGGOCCC AAAGCGOGGA GOGGAGGCCG AGGCGAGAGC CTGGCGCTGT AGGACTAGAA 60 
CGAAAGGAGT GAGGOGCCGA GAGCCCAGAT ACCATTTTGG OGTGACAGCT GGTGGTTGGC 120 
AAGGCCX50GG GAGTGGGAAG CGTCOGCCAT GTTCTGOGAA AAAGCCATGG AACTGATCOG 180 
CGAGCTGCAT OGOGOGCCCG AAGGGCAACT GCCTGCCTTC AACAATTAGC TGGGTGTGGT 240 
GGCAC3«:ACC TGTAGTCCCA GCAACTTAGG AGGCTGAAC5T GAGAGGATTG CATGGCTCCA 300 
GGAAGTTGAA ACTGCAGTGA ACTGTGGTCA CGCTATTACA CTCCAGCCTG GGTGACAGAC 360 
TGAATCCCTG TCTCAAAAAG GAAAAGGAGG ATGGACTCAG ACAAGTTCTG GAGGAGATGA 420 
AAGCTTTGTA TGAACAAAAC CAGTCTGATG T G TTCTCTGT TAAGAAATCG ACGCTGCACT 480 
GTASCATACC TGTATGACOG CTTGCTTCGG ATCAGAGCAC TCAGATC5G 



Seq ID NO: 160 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |
ATCTTCTGCG AAAAAGCCAT GGAACTGATC CGCGAfiCTGC ATOGOSCGCC OGAAGGGCAA 

CTGOCTGCCT TOVACAATTA G 



Seq ID NO: 161 DNA sequence 
Nucleic Acid Accession #: U106S4
Coding sequence: 1333-2280

1 11 21 31 41 51 

GGATCOGGCC GGATCTCAGG GAGGTGAGGA CTTTGTTCTC AGAGGGTGTG TGTGGACAAA 60 

ACAGGGAGGC CCTGTCTTOG ACAGACACAG TGGTCCCAGG ATTGGAGAGC AGTCCAGGTG 120 

AGGAACCTAA GGGAGGATCG AGGGTACCTC CAGGCCAGAG AAACTCTCAG ATCAAfiAGAG 180 

TTTGCCCTGC CCCTACTGTC ACCCCAGAGA GCCGGGGCAG GGCTGTCTGC TGAGGTCCCT 240 

CCTTTATCCT GGGATCACTG GTCTCGGGGA GGGCTGGCCT TGGTCTGAGG GGGCTGCACT 300 

CACGTCAGCA GAGGGAGGGT CCCAG6CCCT GCCAGGAGTC CAGGTGCAGA CTGAGGGGAC 360 

CCCACTCACC AAACACAGAG GACCTAGCCC CACCCTGCCC CTTGTGTCAG CTGAGGGAAG 420 

CCGCTGGGTG GATGGACTCC CCTCACTTCC TCTTCAGGTG TCTCCTGGAG ATAGGGCCTC 480 

AGGTCAACAG AGGGAGGGTT CCAGACCCTG CAQ6CATCAA GATGAGGACC AGGCAGTATC 540 

CTCACCCCAS GACACATGGA CCCCATTGAA TTTAGACATC TCTTACTGTA CTTCCGAGGA 600 

AACCCTGGGC AGGTGTGGGC AGATGTTGGT TGGGGCATGT CCTTCTGTTC CATATCAGGG 660 

ATGTGAGCTC CTGATCTGAG AGACTCTCAG GCAAGTAGAG GAGTAGAGTC CAGTCCCTGC 720 

CAGGAGAAAG GTCAGGGCCC TGAGTGAGCG CAGAGGGGAC CATOCACCOC AAAAGTGTOT 780 

AGAACTCAAG AGTGTCCAGC CCGCCCTCTT GACAGCACTG AGGGACCGGG GCTCTGCCTG 840 

CAGTCTGCAG CCTAAGGGCC CCTCGATTCC TCTTCCAGGA GCTCCAGGAA GCAGGCAGGC 900 

CTTGGTCTGA GACAGTGTCC TCAGGTCGCA GAGCAGAGGA GACCCAGGCA GTGTCAGCAG 960 

TGAAGGTGAA GTGTTCACCC TGAATGTGCA CCAAG6GCCC CACCTGCCCC A6CACACATG 1020 

GGACCCCATA GCACCTOGOC CCATTCCCCC TACTGTCACT CATAGAGCCT TGATCTCTGC 1080 

AGGCTAGCTG CACGCTGAGT AGCCCTCTCA CTTCCTOCCT GAfiGTTCTCG GGACAGGCTA 1140 

ACCAGGAGGA CAGGAGCCCC AAGAGGCCCC AGAGCAGCAC TGAC6AAGAC GTGTAAGTCA 1200 

GCCTTTGTTA GAACCTCCAA GGTTCGGTTC TCAGCTGAAG TCTCTCACAC ACTCCCTCTC 1260 

TCCCCAGGCC TGTGGGTCTC CATCGCCCAG CTCCTGCCCA CGCTCCTGAC TGCTGCCCTG 1320 

ACCAGAGTCA TCATGTCTCT CGAGCAGAGG AGTCCGCACT GCAAGCCTGA TGAAGACCTT 1380 

GAAGCCCAAG GAGAGGACTT GGGCCTGATG GGTGCACAGG AACCCACAGG CGAGGAGGAG 1440 

GAGACTACCT CCTCCTCTGA CAGCAAGGAG GAGGAGGTGT CTGCTGCTGG GT CATC AAgT 1500

CCTCCCCAGA GTCCTCAGGG AGGCGCTTCC TCCTCCATTT CCGTCTACTA CACTTTATGG 1560 

AGCCAATTCG ATGAGGGCTC CAGCAGTCAA GAAGAGGAAG AGCCAAGCTC CTCGGTOGAC 1620 

CCRGCTCAGC TGGAGTTCAT GTTCCAAGAA GCACTGAAAT TGAAGGTGGC TGAGTTGGTT 1680 

CATTTCCTGC TCCACAAATA TCGAGTCAAG GAGCCGGTCA CAAAGGCAGA AATGCTGGAG 1740 

AGCGTCATCA AAAATTACAA GOGCTACTTT CCTGTGATCr TCGGCAAAGC CTCOGAGTTC 1800 

ATGCAGGTGA TCTTIGGCAC TGATGTGAAG GAGGTGGACC CC6CCGGCCA CTCCTACATC 1860

CTTGTCACTG CTCTTCGCCT CTCGTGCGAT AGCATGCTGG GTCATGGTCA TAGCATGCCC 1920 

AAGGCCGCCC TCCTGATCAT TGTCCTGGGT GTGATCCTAA CCAAAGACAA CTGCGCCCCT 1980 

GAAGAGGTTA TCTGGGAAGC GTTGAGTGTG ATGGGGGTGT ATGTTGGGAA GGAGCACATG 2040 

TTCTACGGGG AGCCCAGGAA GCTGCTCACC CAAGATTGGG TGCAGGAAAA CTACCTGGAG 2100 

TACG6GCA6G TGOOOGGCAG TGATCCTGCG CACTACGAGT TCCTGTGGGG TTCCAAGGCC 2160 

CAOGCTQAAA OCAGCTATGA GAAGGTCATA AATTATTTGG TCATGCTCAA TGCAAGAGAG 2220 

CCCATCTGCT ACCCATCCCT TTATGAAGAG GTTrTGGGAG AGGAGCAAGA GGGAGTCTGA 2280 

GCACCAGCCG CA6CCGGGGC CAAAGTTTGT GOGGTCAGGG CCCCATCCAG CAGCTGCCCT 2340 

GCCCCATGTG ACATGAGGCC CATTCTTCGC TCTGTGTTTG AAGAGAGCAA TCAGTGTTCT 2400 

CAGTGGCAGT GGGTGGAAGT GAGCACACTG TATGTCATCT CTGGGTTCCT TGTCTATTGG 2460 

GTGATTTGGA GATTTATCCT TGCTCCCTTT TGGAATTGTT CAAATGTTCT TTTAATGGTC 2520 

AGTTTAATGA ACTTCACCAT CGAAGTTAAT GAATGACAGT AGTCACACAT ATTGCTGTT T 2580 

ATGTTATTTA GQAGTAAGAT TCTTGCTTTT GAGTCACATG GGGAAAT OOC TGTTATTTTG 2640 

TGAATTGGGA CAACATAACA TAGCAGAG6A ATTAATAATT TTTTTGAAAC TTGAACTTAG 2700 

CAGCAAAATA GAGCTCATAA AGAAATAGTG AAATGAAAAT GTACrTAATT CTTGCCTTAT 2760 

ACCTCTTTCT CTCrCCTGTA AAATTAAAAC ATATACATGT ATACC TGGAT TTGCTTGGCT 2820 

TCTTTGAGCA TGTAAGAGAA ATAAAAATTG AAAGAATAAT TTTTCCTGTT CACTGGCTCA 2880 
TTTTTTCTTC AGACACGCAC TGAACATCTG TTATTCXSGAA CACCCTGGGT T 



Seq ID NO: 162 Protein sequence: 
Protein Accession #: AAA68877.1






1 11 21 31 41 51 

| | | | | |

MSLSQRSPHC KPDn)LEAQG EDUGXiMGAQE PTGEEEETTS SSDSKSEEVS AAGSSSPPQS 60

PQGGASSSZS VWTLWSQEO EGSSSQBEBE PSSSVDPAQL EFMFQSAL KL KVABLVHFLb 120 

HKYRVKEPVT KAEMZiESVZK NYKRYPPVIF GKASEFKQVI FGTDVKEVDP AGKSYI LVTA 180 

LGLSCDSMLG DCTSMPKAAL LIIVLGVILT KDUCAPEBVI WEALSVMGVY VGKEHMPYGE 240 

PRKLLTQDWV QENYIiEyRQV PGSDPAHYBP LKGSKAHAET SYEKVIHYLV MLNARBPICY 300 
PSLYEEVIiGE BQBGV 

Seq ID NO: 163 DNA sequence 
Nucleic Acid Accession #: AF292100
Coding sequence: 30-809 

1 11 21 31 41 51 

| | | | | |

GGGGGGGGAG AGC3CCTGGAG GACACCAACA TGAACAAGTT GAAATCATCG CAGAAGGATA 60 

AAGTTOGTCA GTTTATGATC TTCACACAAT CTAGTGAAAA AACAGCAGTA AGTTGTCTTT 120 

CTCAAAATGA CTOGAAGTTA GATGTTGCRA CAGATAATTT TTTCCAAAAT CCTG AACTTT 180 

ATATACGAGA GAGTGTAAAA GGATCATTGG ACAGGAAGAA GTTAGAACAG CTGTA^ATA 240 

GATACAAAGA CCCTCAAGAT GAGAATAAAA TTGGAATAGA TGGCATACAG CAGTTCTGTG 300 

ATGACCTGGC ACTOGATCCA GCCAGCATTA GTGTGTTGAT TATTGOGTGG AAGTTCAGAG 360 

CAGCAACACA GTGCGAGTTC TCCAAACAGG AGTTCATGGA TGGCATGACA GAATTAGGAT 420 

GTGACAGCAT AGAACAACTA AAGGCCCAGA TACCCAAGAT GGAACAAGAA TTGAAAGAAC 480 

CAGGACGATT TAACGATTTT TACCAGTTTA CTTTTAATTT TGCAAAGAAT CCAGGACAAA 540 

AAGGATTAGA TCTAQAAATG GCX»TTGOCT ACT6GAACTT AGTGCTTAAT GGAAGATTTA 600 

AATTCTTAGA CTTATGGAAT AAATTTTTGT TGGAACATCA TAAA06ATCA ATACCAAAAG 660 

ACACTTGGAA TCTTCTTTTA GACTTCAGTA CGATGATTGC AGATGACAtG TCTAATTATG 720 

ATGAAGAAGG AGCATGGCCT GTTCTTATTG ATGACTTTGT GGAATTTGCA CGCCCTCAAA 780 

TTCCTCGGAC AAAAAGTACA ACAGTGTAGC ACTAAAGGAA CCTTTTAGAA TGTACATAGT 840 

CTGTACAATA AATACAACAG AAAATTGCAC AGTCAATTTC TGCTGGCTGG ACTGAACTOA 900 

AGATCAATCC TCACAATTCA GACTGAGGGT TGAGACAAAA CTTTAAGGAT ACATCTTGGA 960 

CCATATCGTA TTTCATTCTT CTAATGGTGG TTTGGGCTTG TCTTCTAGTC TGGGC OGCTC 1020 

TAAACATTTA TAATTCCAAC ATTGTGGATT TCATCTTATA TCIGTGGAOC ATOCTAGTTT 1080 

ATTCTCCCAT AAGTCTTAGA AGCTTTATGG TGATEATTTT GAGGTTTTCA TTCTOGCATA 1140 

AAGCACAATG CTGTCTTCAT CAGAAAACAG TTGGCATAAG AATTAAACAT ATGAACATCA 1200 

CAAAACAATT TATAAAAACT TCTTAAATAT ACGCTTTGGG CTAGTTGCAA AGAC TATGC T 1260 

AATAGCACTT CCAGTGAGAG TGATATATTT AAGTGTACTG GATCTGGAAT GGTGTTTTGG 1320 

TTTGGGGGGA ATTTTTTTTT TTTCCTGGCA AATCACATAT GTTGTTGATG TGAGTATCTG 1380 

ATGAAAAAAC AATGTCAGAA TAACC5GACAT GAAAATTTTT TAGGATAACT TGGTGCCTAC 1440 

CTGAAAAATG TATTGTGTTT TAGACTCTTG ATTTCAAAAG GTTCCACAGA ACTAGTCTGC 1500

GCTTACCTTA CCCATGTTTA TATATAGCTG TCCTACAGGG AGCTTTTATT TAGAAAATGT 1560

CTGCATAATG TTAGATTCTT CTCCTGTCTA CATTATGCAC TACATAATTG GACTTCATTA 1620 

TGCTTTTGAA ATGCTTATCT GCCTGTCACa TAAGTTAAAC TATTTAATTT GTTTTGAATG 1680 

TTTTGGATTG CTACACAATA CAATATTCTA AATTXAGGCA TGAGGGTTTT TTTGTTTTAT 1740 

TTTTACTTTT TTTTTGTCAT TGCACTATGG AACACAAATG AAATTCTCTT AA TTTATAA G 1800 

AAGATAGTAG GAGTTAAATT TTCAAAATGG TTGTGATGAG CCAOSAAATT C AATumxAT 1860

AATATAGGTA CTGCTCTTTC AGACAAACAG TCCATTTTTA ATGACTTCTT ATTTTGTTGA 1920 

AATTACTTTA ACTGCTAATC ACTGTGGTTG CCAAATATTT ACTTCAGAAG CAAAGATTTT 1980 

CAAACAAGCA TACACGATGC AAAATACCAG TCTGGCTTCT AGTCTATTTA CT GTTT TGTT 2040 

TCACXCAGAT TAGCTCAGTT TTCTCATCAA AGCAGAATGC TATCTTGCGT GTGTGTGTGT 2100 

GTGTGTGIGT GTOTgrOI G T GTATGTGTGT ATATATATAT ATATA TATAT ATATATATTT 2160 

' rrrm r i TT ttttttttaa attacaaaag ccatgagctg cttttatgct gaaaatggtc 2220 

ATTTCCCTGT TCACTTACTG ACATGTGAAG AAGGGTTTCT TGCTTTCTTA AACATTTCOG 2280 

TAAGGCAGGC TAGAAATGTA ATACTTCAAA TGTTTGATGA TTATGGTCTT TTGATAGGAA 2340 

TAGATTCT6C TTCGGATATA TATCCAGGCA CTCTCTAAGG TCTAGGGTTG ATATTAACAA 2400 

AGGAATGTAC TTAGAATAGC AGTACATTTT ATGCAAATAT GGAAATTATT TTAAGAAACA 2460 

ATGACATATC AAAACTGCTT TTPACATGAT TTTGAAATA6 ACTAGAAAGC TTTCCCTATA 2520 

GACATATTAA TATTCCAATC ATAACTTTAA TTCAAGAATG CACTTTTACC AAAW3AAAAA 2580 

TTTGAAAATT TCTATTCAGG CTACTGGAAT TGGTTATTAA AAGAAAAAGG AAAAAGAAGA 2640 

ATCTTCCTGC TTTCAGTATT TCCTGATTTT TTTGTAAATA TAAAGAGGAA CTTC AATTAT 2700 

GAAAAATTTT TAAAAGATAT ATATATCTAT ATATCTATAT ATATGTACTG TTTTGTTTCC 2760 

TGTCTTGAAG ATTTTGAGTT ATGGTTATTG GTTTCAGATT GATTAATTCA CAT ATGCT GT 2820 

GTTTTCTTTA AAAGTCATAT GGGTTCGTGG CCTAATGCCT TGGATTTTAC ATATTTTTCT 2880 

TTTTAAATGC AAAACCTTTT CAACAAAATA GTGTTTGTCA TCAGGTTGGT ACTAAACATT 2940 

TATAATTACT GTGTAATTAT AAACAAAAAT ACATAAAGCT TTGAATATAA TTATGTAGCA 3000 

TAAAAGTTAA GGTTCTTCAC TATGATGGCA TCTTAGAATT AAACAAAACT TTTACTAGGG 3060 

CTGAAAAGAG AAGACTGATT TAATGTGGTG TGATTATTCT GAAGATAAAT GTCTGGCTAC 3120 

AGGGAATATT TTGTACTAAA AAATGATTAC ACATATGGCT GTGT6TGTTT GAGTCTGTGT 3180 

CTOXGAGAGA GCCAGAGAGA GTGAGAGAGA TTGACA6AGA AAGGGAGAGA CACACACACG 3240 

CCCCTTGAAT TGCTTTAACT CCTAAGTGTT TCAGTCCTCA TTCCGGTAAA CTCCCCATGC 3300 

TGATTCTTTG TrTTAAACTG AACCATAGGT ACAGTTTCCT TTPTGCCAAA TGTCAAAACA 3360

GGTACAAATT TTAAAATGTA ATGCTTTTTA AATAGAAAAA TGTATAAAAT TAGAAGTGCC 3420 

CACATATAAA AAATACTTGA GATGAAGATT ATCTTTAGTG AATATCATCT GCATATCTCT 3480 

GTAAGTTCAA TTGTGTTTCT TACAGTCOCT GTCATATTAC CAACAGAOGC AATAAAAGCT 3540 
GCAGT6AAAT TG 

Seq ID NO: 164 Protein sequence: 
Protein Accession #: AAG00606

1 11 21 31 41 51

| | | | | |

MKXUCSSQKD KVSQFMIFIQ SSEKTAVSCL SQNDWKLDVA TONFPQNPEL YIRE5VKGSL 60
ORKXLEQLYN RyiCDPQDENX ZGIOGZQQFC DOLALOPASI SVIiIIAHKFR AATQCEFSKQ 120
BFMDGMTELG CDSZEQLKAQ IPXMEQEIiKE PGRFXDFYQP TFNFAKNPGQ K6LDLEKAIA 180
YHMLVLKGRP RPU3LKNKFL liEHHKRSZPK DTNKLLU3FS TMIADDMSHY DEEGAWPVLI 240
ODFVEPARPQ lAGTKSTEV

Seq ID NO: 165 DNA sequence
Nucleic Acid Accession #: AF256215
Coding sequence: 220-2028 

1 11 21 31 41 51

| | | | | |

CTCCAGTOOG CMtSCTCftGT AGCPGCTGCC GG0GGGGCT6 OGGGGOGGCG TCCGCTGCGC 60 
GCCTAOGGGC TGCGGTGGOS GCO6O0G0GG CACCOGGCAC GGOCOGCCAG TCCCCGCTTC 120 
CCTGCTCCAG AGCCGCCGCC TGGGOCGGGG CAGGGCGGGC CXX3GGGCTCC TCCATGCTGC 180 
CAGCCGCCGG GCTGCGGAGC CGACC3UVGTG GCTCCTGCGA TGGCGGCGGA A GftGG AGGCT 240 
GCGGCGGGAG GTAAAGTGTT GAGAGAGGAG AACCAGTGCA TTGCTCCTGT GGTTTCXIAGC 300 
OGarnSAGTC CAQGGACAAG ACXAACAGCT ATGGGGrrCTT TCAGCTCACA CATGACAGAG 360 
TTTCCACGAA AAOSCAAAGG AAGTGATTCA GACCCATCCC AAGTGGAAGA TGGTGAACAC 420 
CAAGtTAAAA TCAAGGCCTT CAfiAGAACCT CATAGCCAAA CTX3AAAAGGG GAGGAGAGAT 480 
AAAATGAATA ACCTGATTGA AGAACTGTCT GCAATGATCX: CTCWGTGCAA OC CCATG GCG 540 
CGTAAACTGG ACAAACTTAC AGTTTTAAGA ATGGCTGTTC AACACTTGAG ATCTTTAAAA 600 
GGCTTGACAA ATTCTTATGT GOGAAGTAAT TATAGACCAT CATTTCTTCA GGATAATGAG 660
CTCAGACATT TAATCCTTAA GACTGCAGAA GGCTTCTTAT TTGTGGTTGG ATGTGAAAGA 720 
GGAAAAATTC TCrrCGrTTC TAAGTCAGTC TCCAAAATAC TTAATTATGA TCAGGCTAGT 780 
TTCACTOGAC AAAGCTTATT TGACTTCTTA CATCCAAAAG ATGTPGCJCAA AGTAAAGGAA 840 
CAACTTTCTT Cm T G ATAT TTCACCAAGA QAAAAGCTAA TAGATGCCAA AACTGGTTTG 900 
CAACTTCACA GTAATCTCCA OGCTGGAAGG ACACX3TGTGT ATTCTGGCTC AAGACGATCT 960 

TTTTTCTCTC GGATAAAGAG TTGTAAAATC TCTGTCAAAG AAGAGCATGG ATG CTTA CCC 1020 

AACTCAAAGA AGAAAGAGCA CAGAAAATTC TATACTATCX: ATTGCACTGG TTACTTGAGA 1080 

AfiCTGGCCTC CAAATATTGT TGGAATGGAA GAAGAAAGGA ACAGTAAGAA AGACAACAGT 1140 

AATTTTACCT GCCTTGTQGC CATTGGAAGA TTACAGCCAT ATATTGTTCC ACAGAACAGT 1200 

GGAGAGATTA ATGTGAAACC AACTGAATTT ATAACCCGGT TTGCAGTGAA TGGAAAATTT 1260 

GTCTATGTAG ATCAAAGGGC AACAGCGATT TTAGGATATC TGCCTCAGGA ACTTTTGGGA 1320 

ACTTCTTGTT ATGAATATTT TCATCAAGAT GACCACAATA ATTTGACTGA CAAGCACAAA 1380 

GCAGTTCTAC AGAGTAAGGA GAAAATACTT ACAGATTCCT ACAAATTCAG AGCAAAAGAT 1440 

GGCTCTTTTG TAACTTTAAA AAGCCAATGG TTTAGTTTCA CAAATCCTTG GACAAAAGAA 1500

CTGGAATATA TTGTATCTGT CAACACTTTA GTTTTGGGAC ATAGTGAfiCC TGGAGAAGCA 1560

TCATTTTTAC CTTCTAGCTC TCAATCATCA GAAGAATOCT CTAGACRGTC CTGTATGAGT 1620 

GTACCTGGAA TGTCTACrGG AACAGTACTT GGTGCTGGTA GTATTGGAAC AGATATTGCA 1680 

AATGAAATTC TGGATTTACA GAGGTTACAG TCTTCTTCAT ACCTTGATGA TTCGAGTCCA 1740 

ACAGGTTTAA TGAAAGATAC TCATACTGTA AACTGCAGGA GTATGTCAAA TAAGGAGTTG 1800 

TTTCCACCAA GTCCTTCTGA AATGGGGGAG CTAGAGGCTA CCAGGCAAAA CCAGAGTACT 1860 

OTTGCTGTCC ACAGCCATGA GCCACTOCTC AGTGATGGTG CACAGTTGGA TTTCGATGCC 1920 

CTATCTCACA ATGATGACAC AGCCATGGCT GCATTTATGA ATTACTTAGA AGCAGAGGGG 1980 

GGCCTGGGAG ACCCTGGGGA CTTCAGTGAC ATCCAGTGGA CCCTCTAGCC TTTGATTTTT 2040 

AACTCCAAAA ATGAGAAACA TTTTAAAGCA TTATTTAGGA AAAAACTGTC TCAACTATTC 2100 

TTAAGTACTG TATTGATATT GTTTGTATCT TTTATTAATG TTCTACCACT TTITATAfiAT 2160 

TTGCATCTTC CTGTCACAGG GATGTGGGGA AATACGTTTT CCTCCCAAGA GAACCAAGTT 2220 

TATTATAGAC TCCTTTATTC AGTGAAATGG CTTATAATCC ACTAGTTGCC ATATTTTTGC 2280 

TAAAATATTT CTAACCAAGA ATACTACTTA CATATT6TTT TGGC TTTG TT TTATTTTTGA 2340 

TGCACyrPTTT TTTAGTTGAG GTAATGTAAT ATATT6ATOT TrrCCmGT GTCTAAGATT 2400 

GATTTATAAT AGTAGGTTTG TATAATTTGO AACATTTTCC ATGCCTTGCG AATTTCCTTA 2460 

ATTGAGGATA GGGCTTACAC ACTTTAAGAA AACAGTGAGT ACTTGAACAT TTAAAGGGAC 2520 

AGTGCAATTT ATAGTCATAA TCACATTGAA TACTGTATTT GATCTTTGGA GACTTAGGCA 2580 

AGCACAGAGC TGGGATATTT ATGCTCAGTT GAGCACTTTA AGATGAATTT TAAGTGAGAT 2640 

GATTTCTTGC TTAAAACTCA GAAAGTCAAA AGAGTTTCAG CTTTCCTTAC AGAAAAGGAA 2700 

GGATCTTGGG CCCTAGATCT TGGGGATTAA CCTCTGCATA TAAGATTTAC TCTTAATAGG 2760 

CCAGACGTGG TGCTCACGCC TGTAATCCCA GTACTTrGGG AGGCTGAGAC GGGCAGATCA 2820 

CTTGAGGTCA GGAGTTCAAC ACCAGCCTGG CCAATATGGT GAAAOCCOCT TTCTACTAAA 2880 

AATACAAAAA AAATTACCCA GGCACTCACT CTTGAGGTAA CTAACCAACT CCCAOGATAA 2940 

TGACAGTCCA TTCATGAGOG CAAAGGCCTC ATGACCTAAT GGCACACACC TGTAATCCCA 3000 

ACTGCTTGGG AGGCTGAGGC GAGAGGATTG CTTGAACCTG GGAGGCAGAG GTTGCAGTGA 3060 

GCCGAGATCG CACCACTGCA CTCCAGTCTG GGCAACAGAG TGAGACTTCA TCTCA AAAAA 3120 

AGTAAAAAAA AAGATTTAAT ATAATCACTG AAGATCTCTA TTATAGATAG ATTAGGTTTr 3180 

TGACATTGGA AACATACTTA GGGATAGATT TGTCCTAAAG GAAAAAAGTA GGCCCGGGCA 3240 

GATTAAATGT CTTGTGTAAA GTCACACATT AAATTCAGTC ACACATTAAA TTCATAGAGT 3300 

TTTAAATGTT TAATGTATAT AAACCAGTTT CTTTATACAC ATTTGGGAAA ACATTGGTCT 3360 

CACAGATTAA ATGATTAACT AACTGACCCA GGAACTAGTT GTAGCTTTCT AAGTAATTAG 3420 

GCAATTACAG TTATTGCCTG TAACCAAAGG TAATAAAACA AAATGACAAG TACATGTTTA 3480 

AAATTATGAG GCAATGAGAA ATAATTTAAA AACCAATTTT CTAGTTATAA TTTAAAATTT 3540

GGAGAGCATT TTTAACAGTA ATTAATCCAG AGGTGGCTCA AATTGAGTAT A AGAA TTAAG 3600 

ATTATTTAAA ATACTGCATG TCTACCTTCT CGGGGATCAT ACTTTATAAC ACTTTCTGCT 3660 

TCAGTAGCTC TTCATAGCTT GCCAAGTATG CTCCCATATT TTCTCTCTOS T6CCTCGCAA 3720 

ATGAAAGTCA GATAGGCTGG GAACTCATGG GGCAGCCCTC AGACTTCAAT GTGGGCTTCA 3780 

AATCCAGTTT CCTGTTCTAT ATGGTGCTAC ATCTTTCCAG AAAATTTCCC TCAGAGCCCC 3840 

TCGCCAAAAC AAAGCATTAT TTTGACCCTG CATGCTATTT CTTTAGCTGT AGGTGATAGA 3900 

TTAGAACTTC TGTCAGACAT GTTAATGACA AACATACCAA CAGACAATAA CCAAAG CAAA 3960 

TGTTTCCTTC AAGTGTGAAA TGTGCAGGGG CTCGTGGGCA AGQATGTATT GGCACACTGT 4020 

CCTCTTGAAC TGATAGTGTC CCAGCAATGT TGGAGGTTGG CACCATTCCT GGTCCGACAC 4080 

TTGAGGACCT GAGAGACATC AGGTTTAGAA TGAGCCAAAG AAATCCTACA AGATGGGGAG 4140 

AATTGGTGTG CAfiCAGCCTA AGTGTTATAG TTAAGTCTAA AGAAGTATGA AAGATCCCCT 4200 

GTGTTCTCTA AATTGAGCAG AGGGGCCIGC CTACCRATAT CACTTTTTAG GGQACTGAAC 4260 

CATTGCAGGT TAGACTPQGC TTGCAAA6AG TCIGCCTAAG CCAGGGGTGG CAGGGTAGGC 4320 

CATCATAGCT GGATGGCCTC AAAAGCAGAT GGGGGCAGAC TTGCCCTCGT GATGCCAGGA 4380 

TTTGAGAGGC AGAGTTTCTA GAGG6AGACC AGTGCTGCCT CTCACAGTGG CAGTTTTTTC 4440 

TCTTTGCAAG AGGAGGGGCT GTTCAATTCC ATAGACCAGT GGGCAGATAG CCAGTTGAAT 4500 

ACTCTGTGCA TGGTTTGATC CTTTATTAGT TCGCTCTAAT ATTTTTCTGT AGATCCTTTT 4560 

GTCCTGGACT CAAAATCTAA TCCATGCATT GTATGATACC GTAGCT CTCC TAAGGTTTGT 4620 

GTTTCCTTCA AAATGTTTTA GTTTTCTTCA ACTAAATTTG ATTTTTGCTG TTAGAAGT6A 4680 

CATATTTTTA TGGTATACAC TATCTTCCTT TTTTCTACTG OGAGTCAATT TTTTGAATTT 4740 

TGGTOAGAAA GAATATATCT ACAAATTGCA OGAAAGTATC ATAAAAACftG TACTCTAGAG 4800
CAGOGCIGTC CAATAGAAAT ATAATCTGAG CCACATGTAT AATTTTATTT TCTTCTAGCC 4860

ACATTAAAGA AGTAAAAAGA TACAAGTAGA ACTAATTTTA ATX3TTTTAAT TCAGTATATC 4920 

CAAAAXATCA TTTGAACATG TAATTAATAT AAAATTATTA ATGIGATA3T TTACATTCTT 4980 

TTGGTAATAC TAGTCTTCAA AATCTGGTAT GTATCTTACA TTGATAQCAC ATCrCACTTT 5040 

GTACTAGCCA CATTGCAAGT GCTCAGTAGC CACATGTGGC TAGTGGCTAC TGCACTGGAC 5100 

AGCACAGTTC TAGC3TTCCAC CCTAACAOOC AAGTCCTGTG GATTAGAATC CCAGAATCAG 5160

AGCTGGAAGT AAACATAGAG ATCAAAOCTC CTTTTAAAAA TGAOGAOGCT GAGGCACAGA 5220 

6TTTAAATGG CTTGCATGAG GTCATACAGC TAAATTCAGC CTCAACAGGG TCTTCTGATT 5280 

CCAGGCACTC tTGCCACTCC ACTACATTAC TGTAGTGGTA ATTCTTA066 TTAAAAAAAG 5340 

TGTACAGTAG GOOGQGOGCA GTGGCTCATG CCIGTAATCC CAGGACrTTG GGAGGCOGAA 5400 

GTGGGCGGAT CAOSAGGTCA GGAGATOGAG ACCATCCTGG CCAACATGGT GAAACCCCGT 5460 

CTCTACTGAA AATAC3U\AGC AAAATTAGOC AGGTGTGGTG GCXXMOGCCT GTGGTCCCAG 5520 

CTGCTCTGGA GGCTGAGGGA 6AATGGGGTG AAaX»GGAG GCAGAGATXX3 CAGTGAGCCA 5580 

AGATGGOGOC ACTGCAGOCC AGOCTGGGOS ACAGAGG6AG ACTOCSITCTC AAAAAAAAAA 5640 

AAAAAAAAAA AAGAAAAGAA AAGAAAAGTC TAGAGAACAT TATATTAAGT GCTTATTATT 5700 

GAAGTAGACC AAAGTTTATA CCATAAGGAT ATTTTTCCTT AAATAC3CATG TTTGAAGAAC 5760 

AATTATTTAT TGATCCTTGA ATCTGTAAGA tCAAATAACA AGTCTCTATC CATGTTACCA 5820 

AATTTAACCT TTTGAAAATA ATAAACTTTA AAATATCAGA TGTGTTATTA CAGGATGATA 5880

CTTGGAATCA AGTGAAATGA GTTATATQGT CATCACTAAA TTTAGAAATC TATTGTGAAA 5940 

CAAAGACAAA CAOGAAAGTA CAGAATAGAG ACTTTTAGTA AATAAATGGA ATTTAAAAGA 6000 

AAGTGTTTAT TTACaGTGTC AOGACAGAAA AGGATGTCTT TGTTGrCATA GTCTTTGAGG 6060 

GATCTCOGTA AAATCPG6G6 CACAGGTACA AGAAATAGCC AATATTTAGT TCCCAGACCA 6120 

TGTTTAGTAG TGTCCAGTrr CAGATCATGC TGCCAAGAGG TATCTCCCCC TCAGC?TGGGT 6180 

CATCACTGAG CCCTGGAATT GGAGACTCAT ACTTGCCXaG CACAATGTTA OGGGCAGACA 6240 

GGCCGACATC TATGATTAGC TAGAAGCCAT AAAGAAAAGC TGCTAAG TGG CCACTAGGTG 6300 

CCACTTTTCT GTTTTTGTAA tGClTTCATT AGGAGATCTT TTTTTTCCAA GCTCCATGGG 6360 

GCCTATGAGA GGCATTTATG ATTTTTGTGC CTACAATAAG TCAGCCTGTC TGGTGTGAGT 6420 

TGTTTTATGA GAAATGCTTT CCAAGGGAGG TCTAGGAAGA TCCTGACACA lAAGAACTTT 6480 

GGCTTAGAGA GCTTTCCAGG TGTAGTGCCA ATAAAAACTG ACCTGGAAAG AAAACCTGOC 6540 

CAGCA06GAA CATGCTTTCT GAACTCACTT GAGAGTGTAT GGTGTATGTC ACTTCTCATA 6600 

TATTCTTGAG TTTAGATTTG TCTTTTATAC AATTTTTAGC TCTTTTCCAG TTCACTTGTG 6660 

CTOSTCTGTA TATTGGTATT TTTAAATTTT TGTGGTAAAT AATCAAAAGA GTGAAATTAT 6720 

ATTTTATAAT TACTCATTTG TAGTTTTTTT TTTTAATTTA ATAAACTTCC TCCAAAAAGT 6780
GCTCCCTTAA AA 



Seq ID NO: 166 Protein sequence: 
Protein Accession #: AAG34652 

1 11 21 31 41 51 

| | | | | |

MAAEEEAAAG GKVLREENQC lAPWSSRVS PGTRPTAMGS FSSHMTEFPR KRKGSDSDPS 60 
QVEDGEHQVK MKAFRBAHSQ TEKRRRDKMN NLIEELSAMI PQCNFMARKL DKLTVLRMAV 120 
QHIiRSLKGLT NSYVGSNYRP SFLQDMELRH LILKTABGFL FWGCERGKI LFVSKSVSKX 180 
LNYDQASLTG QSLFDFmPK DVAKVKEQLS SFDISPREKL IDAKTGlJQVH SNLHAGRTRV 240 
YSGSRRSFFC RIKSCKXSVK EBHGCLFNSK KXEHRKFYTI HCTGYLRSWP PNIVGMEEER 300 
NSKKDKSHFT CLVAIGRLQP YXVPQNSGEZ NVKPTEFITR FAVNGXFVYV 0QRATAXL6Y 360 
LPQBLLGTSC YEYFKQDDHH NLTDRHKAVL QSKEKILTOS YKFRAKDGSF VTLKSQWFSF 420 
TNPWTKELEY IVSVNTLVLG HSEPGE;iSPIi PCSSQSSEES SRQSCMSVPG MSTGTVLGAG 480 
SIGTDIANEI LDI/QRLOSSS YLDDSSPTGL MKDTHTVNCR SMSNKELPPP SPSEMGELEA 540 
TRQNQSTVAV HSEBPLLSDG AQLOFDALCD NODTAMAAFM NYLBAEGGliG OPGOPSDIQW 600 
TIi 

Seq ID NO: 167 DNA sequence
Nucleic Acid Accession #: NM_014400
Coding sequence: 66-1126

1 11 21 31 41 51 

| | | | | |

GGTTACTCAT CCTGGGCTCA GGTAAGAGGG CCCGAGCTOS GAGGOQGCAC ACCCAGGGGG 
GACGCCAAGG GAGCAGGAGG GAGCCATGGA CCCG6CCAGG AAAGCAGGTG CCCAGGCCAT 
GATCTGGACT GCAGGCTGGC TGCTGCTGCT GCTGCTTCGC GGAGGAGCGC AGGCCCTGGA 
GTGCTACAGC TGCGTGCAGA AAGCAGATGA CGGATGCTCC CCGAACAAGA TGAAGACAGT 
GAAGTG06CG COGGGCGTGG ACGTCTGCAC CGAGGCCGTG GGGGCGGTGG AGACCATCCA 
GGGAGAATTC TC6CTGGCAG TGCSGGGTEG C366TTOGG6A CTCCCOGGCA AGAAIXSACOG 
OGGCCTGGAT CTTCAOGGGC TTCTGGOGTT CATCCAGCT6 CAGCAATGCG CTCAGGATCG 
CTGCAACGCC AAGCTCAACC TCACCTOGCG GGCGCTOSAC CCGGCAGGTA ATGAGAGTGC 
ATACCOGCCC AAOGGCGTGG AGTGCTACAG CTGTGTGGGC CTGAGCCGGG AGGCGTGCCA 
GGGTACATCG CCGCOGGTCG TGAGCTGCTA CAACGCCAGC GATCATGTCT ACAAGGGCTG 
CTTCGACGGC AACGTCACXn' TGACGGCAGC TAATGTGACT GTGTCCTTGC CTGTCOGGGG 
CTGTGTCCAG GATGAATTCT GCACTCGGGA TGGAGTAACA GGCCCAGGGT TCACGCTCAG 
TGGCTCCTGT TGCCAGGGGT CCXX3CTGTAA CTCTGACCTC GGCAACAAGA CCTACTTCTC 
GCCT06AATC GCACCOCTTG TCGGGCTGOC COCTCCAGAG CCCAOGACTG TGGCCTCAAC 
CACATCTGTC ACCACTTCTA CCTCGGCCCC AGTGAGACCC ACATGCACCA CCAAAGCCAT 
GCCAGCGCCA ACCAGTCAGA CTCCGAGACA GGGAGTAGAA CACGAGGCCT CCOSGGATGA 
GGAGCCCAGG TTGACTGGAG GCGCCGCTGG CCACCAGGAC CGCAGCAATT CAGGGCAGTA 
TCCTGCAAAA GGGGGGCCCC AGCA6CCCCA TAATAAAGGC TGTGTGGCTC CCACAGCTGG 
ATTG6CAGCC CTTCTGTTGG CCGTGGCTGC TGGTGTCCXA CTGTGAGCXT CTOCACCTGG 
AAATTTCCCT CTCACCTACT TCTCTGGCOC TGGGTACCCC TCTTCTCATC ACTTCCTGTT 
CCCACCACTG GACTGGGCTG GCCCAGOICC TGTTTTTCCA ACATTCCCCA GTATCCCCAG 
CTTCTGCTCC GCTGGTTTGC GGCTTTGGGA AATAAAATAC CGTTGTATAT ATTCTGGCAG 
GGGTGTTCTA GCTTTTTGAG GACAGCTCCT GTATCCTTCT CATOCTTGTC TCTCOGCTTG 
TCCTCTTGTG ATGTTAGGAC AGAGTGAGAG AAGTCAGCTG TCACGGGGAA GGTGAGAGAG 
AGGATGCTAA GCTTOCTACT CACTTTCTCC TAGCCAGCCT GGACTTTGGA GOGTGGGGTG 
GGTGGGACAA TGGCTCCCCA CTCTAAGCAC TCCCTCCCCT ACTCOCCGCA TCTTrGGGGA 
ATCGGTTCCC CATATGTCTT OCTTACTAGA CTCTGAGCTC CTOGAOGGCA GGGAOCGTGC 
CTTATGTCIG TGTGTGATCA GTTTCTQGCA CATAAATGCC TCAATAAAGA TTTAATTACI
TTGTATAGTG AAAAAAAA



Seq ID NO: 168 Protein sequence: 
Protein Accession #: NP_055215

1 11 21 31 41 51

| | | | | |

MDPARKAGAQ ANIWTAGWLL LLLLRGGAQA LECYSCVQKA DDGCSPNKMK TVKCAPGVUV 60 

CTEAVGAVET IBGQFSLAVX GGGSGUPGKN DRGLDLHGLL APIQLQQCAQ DROUUCUILT 120 

SRALDPAC21S SAYPPNGVEC YSCVGIjSSEA OQCSTSPPWS CYNASDHVYK GCFDGKVTLT 180 

AANVTVSZiPV RGCVQDEPCT RDGVTGPGFT LSGSCCQGSR aJSDLSKKTY FSPRIPPLVR 240 

LPPPEPTTVA STTSVTTSTS APVRPTSTTK PMPAPTSQTP RQGVEHEASR DEEPRLTQGA 300 
AGUQZIRSMSG QYPAKGGPQQ PRNKGCVAPT AGZiAAIJ<IAV AAGVLL 

Seq ID NO: 169 DNA sequence
Nucleic Acid Accession #: NM_006875
Coding sequence: 186-1190

1 11 21 31 41 51

| | | | | |

GAATTGGGCA OQAGGGGGOG GOGAATCTCR AOGCTGOGCC GTCTGOGGGC GCTTCOGGGC 60 

CACCAGTTTC TCTGCTTTCC ACCCTGGOGC OCCCCAGCXX TOGCTCCCCA GCTGGGCTGC 120 

CCCGGGCGTC CAOGCCCTGC GGGCTTAGCG GGTTCAGTGG GCTCAATCTG OGCAGOGCCA 180 

CCTCCATGTT GACCAAGCCT CTACAGGGGC CTCCCGCGCC CCCCGGGAOC CCCACGCCGC 240 

OGCCAGGAGG CAACGATOSG GAAGOCSTTOG AGGCOGAGTA TCGACTCGGC CXXXTTCCTGC 300 

GTAAGGGGGG CTTTGGCACC GTCTTCX5CAG GACACCGCCT CACAGATOGA CTCCAGGTGG 360 

CXyVTCAAAGT GATTCCCCXK3 AATCGTGTGC TGGGCTOGTC OCCCTrGTCa GACTCAGTCA 420 

CATGCCCACT CGAAGTCGCA CTGCTATGGA AAGTGGGTGC AGGTGGTGGG CACCCTGGCG 480 

TGATCCGCCT GCTTGACTGG TTTGAGACAC AGGAAGGCTT CATGCTGGTC CTCGAGOGGC 540 

CTTTGCCXEC CCAGGATCTC TTTGACTATA TCACAGAGAA GGGCCCACTG GGTGAAGGCC 600 

CAAGCCGCTG CTTCTTTGGC CAAGTAGTGG CAGCCATCCA GCACTGCCAT TCCOGTGGAG 660

TTGTCCATOG TGACATCAAG GATGAGAACA TCCTGATAGA CCTAOGCCGT GGCTGTGCX» 720 

AACTCATTGA TTTTGGTTCT GGTCCXXTTGC TTCATGATGA ACCCTACACT GAGTTTGATG 780 

GGACAAGG6T GTACAGCCCC OCAGAGTGGA TCTCTCGACA CCAGTAOCAT GCACTGC006 840 

CCACTGTCTG GTCACTGGGC ATCCTCCTCT ATGACATG6T GTGT6GGGAC ATTCCCTTTG 900 

AGAGGGACCA GGAGATTCTG GAAGCTGAGC TCCACTTCCC AGCCCATGTC TCCCCAGACT 960 

GCTGTGCCCT AATCCGCCGG TGCCTGGCCC CCAAACCTTC TTCCCGACCC TCACTGGAAG 1020 

AGATCCTGCT GGACCCCTGG ATGCAAACAC CAGCCGAGGA TGTTACCCCT CAACCCCTCC 1080 

AAAGGAGGCC CTGCCCCTTT GGCXTTOGTCC TTGCTACOCT AAGCCTGGCX: TGGCCTGGCC 1140 

TCGCCCCCAA TGGTCA6AAG AGCCATCCCA TGGCCATGTC ACAGGGATAG ATG6ACATTT 1200

GTTGACTTGG TTTTACAGGT CATTACCAGT CATTAAAGTC CAGTATTACT AAGGTAAGGG 1260 

ATTGAGGATC AGGGGTTAGA AGACATAAAC CAAGTTTCCC CAGTTCCCTT CCCAATOCTA 1320 

CAAAGGAGCC TTCCTCCCAG AACCTGTGGT CCCTGATTTT GGAGGGGGAA CTTCTTGCTT 1380 

CTCATTTTGC TAAGGAAGTT TATTTTGGTG AAGTTGTTCC CATTTTGAGC CCCXMGACTC 1440 

TTATTTTGAT GATGTGTCAC CCCACATTGG CACCTCCTAC TACCACCACA CAAACTTAGT 1500 

TCATATGCTT TTACTTGGGC AAGGGTGCTT TCCTTCCAAT ACCCCAGTAG CTTTTATTTT 1560 

AGTAAAG6GA CCCTTTCCCC TAGCCTAGGG TOCCATATTG GGTCAAGCTG CTTACCTGCC 1620 

TCAGCCCAGG ATTTTTTATT TTGGQGGAGG TAATGCX3CTG TTGTTACCCC AAGGCTTCTT 1680 

TTTTTTT T TT TTTTTTTTTG GGTGAGGGGA CCCTACTTTG TTATCCCAAG TGCTCTTATT 1740 

CTGGTGAGAA GAACCTTAAT TCCATAATTT GGGAAGGAAT GGAAGATGGA CACCACCGGA 1800 

CACCACCAGA CAATAGGATG GGATGGATGG TTTTTTGGGG GATGGGCTAG GGGAAATAAG 1860 

GCTrGCTGTT TGTTTTCCTG OGGOGCTCCC TCCAATTTTG CftGATTTTTG CAACCTCCTC 1920 

CTGAGCOGGG ATTGTCCAAT TACTAAAATG TAAATAATCA OGTATTGTGG GGAGGGGAGT 1980 

TCCAAGTGTG OOCTCCTTTT TTTTCCTGOC TGGATTATTT AAAAAGCXIAT GTGTGGAAAC 2040 
CCACTATTTA ATAAAAGTAA TAGAATCAGA AAAAAAAAAA AAAAAAAA 



Seq ID NO: 170 Protein sequence:
Protein Accession #: NP_006866

1 11 21 31 41 51 

| | | | | |

MLTKPLQGPP APPGTPTPPP GGKDREAFEA EYRLGPLLGK GGPGTVPAGH RLTDRLOVAI 60 
KVIPRNRVL6 WSPLSDSVTC PLEVALLWKV GAGGGHPGVX RLLDWFETQE GFMIiVLERPL 120 
PAC^LFDYIT EKOPLGEGPS RCFFGQWAA ZQHCBSI16W KRDIXDENIL IDLRRGCAKL 180 
IDFGSGALLH DEPYTDFDGT RVYSPPSWIS RHQYBALPAT VHSLGILLYD MVOGDIPFER 240 
DQEILEAELK FPAHVSPDCC ALIRRCtiAFK PSSRPSLBEI ZiLDPWMQTPA EDVTPQPLQR 300 
RPCPFGLVIiA TLSXiAHPGLA PNGQKSHFHA KSQG 

Seq ID NO: 171 DNA sequence 
Nucleic Acid Accession #: NM_003646
Coding sequence: 89..2875

1 11 21 31 41 51

| | | | | |

GCGGCGCGGA GOGGGCGTCC TGAGCCCCGG CCGCCGGCCC GGCATGGGCG TCTCCCGCGG 60 

GCCCTC060C GGC06GGGCT AGGGCCGGAT GGAGCCGCGG GAOGGTAGCC CCGAGGCCCG 120 

GAGCA6CGAC TCCOAGTOGG CTTCOGCCTC GTCCAGOQGC TCCGAGC3G06 ACGCCGGTCC 180 

CGAGCCGGAC AAGGCGCCGC GGOGACTCAA CAAGCQGCGC TTCCCGGGGC TGCGGCTCTT 240 

CGGGCACAGG AAAGCCATCA GCAAGTOGGG CCTOCAGCAC CTGGCCCCCC CTCCGCCCAC 300 

CCCTGGGGCC CCGTGCAGCG AGTCAGAGCG GCAGATCCGG AGTACAGTGG ACTGGAGOGA 360 

GTCAGCGACA TATGGGGAGC ACATCTGGTT CGAGACCAAC GTGTCOGGGG ACTTCTGCTA 420 

CGTTGGGGAG CAGTACTGTG TAGCCAGGAT GCTGAAGTCA GTGTCTCGAA GAAAGTGOGC 480 

AGCCTGCAAG ATTGTGGTGC ACACGCCCTG CATCGAGCAG CTGGAGAAGA TAAATTTCCG 540 

CTGTAAGCGG TCCTTCCGTG AATCAGGCTC CAG6AATGTC OGCGAGCCAA CCTTTGTA06 600 

GCACCACTGG GTACACAGAC GAOSOCAGGA C30GCAAGTGT 0GGCACTGT6 GGAA6GGATT 660 

CCAGCAGAAG TTCACCTTCC ACAGCAAGGA GAtTGTGGCC ATCftGCTGCT CGTGGT6CAA 720
GCAG601TAC CACA0CAAG6 I V l XX T^ bCTf OITGCIGCAG CAGATOGAGG AGCGGIGCTC 780

GCTGGGGCTC CAOGCAGCCG TGGTCATCCC GCOCaCCTOG ATCCTCOGOG CCOGGAOSCC 840 

CCAGAATACT CTGAAAGCAA GCAAGAAGAA GAAGAGGGCA TCCTTCAAGA GGAAGTCCAG 900 

CAAGAAAGGG CCTGAGGAGG GCCGCTGGAG ACXXTTTCATC ATCAGGCXX3V CCCOCTCCOC 960

GCTCATGAAG CCCXrTGCTGG TOTTTGrGAA CCCCPMSMTT GC3GGGCAACC AGGGTGCAAA 1020 

GATCATCCAG TCTTTCCTCT GGTATCTOU TCCXXXSACAA GTCTTCX3ACC TGAGCCACGG 1080 

AGGGCCCAAG GAGGGGCTGG AGATGTACC6 CAAAGTGCAC AAOCTGOGGA TCCTGGOGTG 1140

GGGGGGOGAC GGCAOGGTGG GCTGGATCCT CTOCAOOCIG GftOCACCIAC GGCTGAAGOC 1200 

GCCACOCCCr GTTGOCATCC TQCOCCTGGG TACPQGCAAC GACTTQGCOC GAACCXTTCAA 1260 

CTGGGGTGGG GGCTACACAG ATGAGCCTGT GTOCAAGATC CTCTC(XAOG TGGAGGAGGG 1320 

GAAOGTGGTA CAGCTGGACC GCTGGGACCT CCACGCTGAG CCCAACCCCG AGGCAGGGCC 1380 

TGAGGACCCA GATGAAGGCG OCACOGACGG GTTGOCCCTG GATGTCTTCA ACAACTACTT 1440 

CAGCCTGGGC TTTGAOGCCC AOGTCACCCT GGAGTTCCAC GASTCXOGAG AGGCCAACCC 1500 

M3AGAAATTC AACAGCCGCT TTCXSGAATAA GATGTTCTAC GCOGGGACAG CTTTCTCTGA 1560

CTTCCTGATG GGCAGCTCCA AGGACCTGGC CAAGCACATC CGAGTGGTGT GTGATGGAAT 1620 

GGACTTGACT COCAACATCC AGGACCTGAA ACCCCAGTGT GTTGTTTTCC TGAACATCCC 1680 

CAGGTACTGT GCGGGCACCA TGCCCTGGGG CCAC C CT G GG GAGCACCAOG ACTTTGACCC 1740 

CCAGOGGCAT GACGACGGCT ACCTCGAGGT CATTGGCTTC ACCATGACGT CGTTGGCOGC 1800 

GCTGCAGGTG GGCGGACACG GCX^AGCGGCT GAOGCAGTGT OGOGAGGTGG TGCTCACCAC 1860 

ATCCAAGGCC ATCCGGGTGC AGGTGGATG6 OGAGCCCTGC AAGCTTGCAG CCTCAOGCAT 1920 

GCGCATOGCC CXGOSCAACC AG6CCAOCAT GGTGCA6AAG G0CAAG0G6C GGAGQGCOGC 1980 

CCCC C TGCAC AGOGACCAGC AGC0GGT6CC AGAGCAGTTG OGCATCCAGG TGAGTG60GT 2040 

CAGCATGCAC GACTATGAGG CCCPGCACTA OGACAAGGAG CAGCTCAAGG AGGCCTCTGT 2100 

GCOGCTGGGC ACTGTGGTGG TCCCAGGAGA CAGTGACCTA GAGCTCTGCC GTGCCCACAT 2160 

TGAGAGACTC CAGCAGGAGC CXZGATGGTGC TGGAGCCAAG TCCCGGACAT GCCAGAAACT 2220 

GTOCC0CAA6 TGGTGCTTCC TOGACGCCAC CACTGGCAGC G6CTTCTACA GGATGGACC6 2280 

AGCCCAGGAG CACCTCAACT ATGTGACTGA GATOGCACAG GATGAGATTT ATATCCTGGA 2340 

CCXrrGAGCTG CTGOGGGCAT OGGCCOCSGCC TGACXTTOCCA ACCCCCACTT CCCCTCTCCiC 2400 

CACCTCaCCC TGCTCACCCA OGCCCCGGTC ACTGCAAGGG GATGCTGCAC CCCCTCAAGG 2460 

TGAAGAGCTG ATTGAGGCTG CCAAGAGGAA CGACTTCTGT AAGCTCCAGG AGCTGCACCG 2520 

AGCTGGGGGC GACCTCATGC ACCGAGACGA GCAGAGTCXSC ACGCTCCTGC ACCACGCAGT 2580 

CAGCACTGGC AGCAAGGATG TGGTCCGCTA CCTGCTGGAC CAOGCCCCCC CAGAGATCCT 2640 

TGATGCGGTG GAGGAAAACG GGGAGACCTG TTTGCACCAA GCAGCGGCCC TGGGCCAGCG 2700 

CACGATGTGC GACTACA7G6 TG6A0600GG GGGCTOSCTC AIGAAGACAG ACCAGCA06G 2760 

CGACACTCCC CGGCA60GG6 CTGAGAAG6C TCAGGACACC GAGCTGGCCG CCTACCTGGA 2820 

GAACCGGCAG CACTACCAGA TGATCCAGCG GGAGGACCAG GAGACGGCTG TGTAGGGGGC 2880 

Seq ID NO: 172 Protein sequence: 
Protein Accession #: NP_003637

1 11 21 31 41 51 

| | | | | |

MEPRDGSPCA RSSDSBSASA SSSGSERDAG PEPDKAPRHL NKRRFPGLRL FGHRKAITKS 60 

GliQHLAPPPP TPGAPCSESE RQIRSTVDWS ESATYGEHIW PETNVSGDFC YVGEQYCVAR 120 

MUCSVSRRKC AACKIWHTP CIEQLEKINF RCKPSFRESG SRNVREPTFV RHHWVHRRRQ 180 

D6KCRKGGKG FQQKFTFHSK BIVAISCSWC KQAYHSKVSC FMZiQQIEEPC SLGVHAAWI 240 

PpmilAARR PQUTLKASKK KKRASFRRKS SKKGPEEGRtl RPFIIStPTPS PLHKPLLVFV 300 

NPKSGGNQGA KIIQSPLHYL NPKQVFOLSQ G6PKEALEMY RKVHNZiRILA CGGOGTVGWI 360 

LSTLDQLRLK PPPPVAILPL GTGNDLARTL NWGGGYTDEP VSKILSHVEE GNWQLDRWD 420 

LHAEPNPEAG PEDRDEGATD RLPLDVFNNY PSLGFDAHVT LEFHESREAN PEKFNSRFRN 480 

KMPYAGTAFS DFLMGSSKDL AKHIRWCDG MDLTPKIQDL KPQCWFLNI PRYCAGTMPW 540 

GHP6ERHDFB PQRHDDGYLB VIGFTMTSIA ALQVGOIGER LTQCREWIiT TSKAZPVQVD 600 

GBPCKIAASR IRIAItRMQAT HVQKAKRRSA APLHSDQQPV PEQLRIQVSR VSMHDYEALH 660 

YDKEQLKEAS VPLGTWVPG DSDLELCRAH lERLQQEPDG AGAKSPTCQK LSPKMCFLDA 720 

TTASRFYRID RAQEHLNYVT EIAQDEIYIL DPELLGASAR PDLPTPTSPL PTSPCSPTPR 780 

SLQGDAAPPQ GEELIEAAKR NDFCKIiQELH RAG6DLMHRD BQSRTIiLKKA VSTGSKDWR 840 

YIiLOilAPPEI LDAVSEHGBT CLHQAAALGQ RTICHYXVEA GASLMKTDQQ GDTPRQRAEK 900 
AQOTELAAYL QIRC2HYQMIQ REDQETAV 

Seq ID NO: 173 DNA sequence
Nucleic Acid Accession #: AF232772
Coding sequence: 1-1662

1 11 21 31 41 51 

| | | | | |

AT6G0GGTGC A6CTGAOSAC AGCCCTGOGT GTGGTGGGCA CCACCCTGTT TGCCCTGGCA 60 

GTGCTGGGTG GCATCCTGGC AGCCTATGTG ACGGGCTACC AGTTCATCCA CACGGAAAAG 120 

CACTACCTGT CCTTOGGCCT GTACGGCGCC ATCCTGGGCC TGCACCTGCT CATTCAGAGC 180 

CTTTTTGCCT TCCTGGAGCA COGGCGCATG OGAOGTGCCG GCCAGGCCCT GAAGCTGCCC 240 

TCCCCGCGGC GGGGCTCGGT GGCACTGTGC ATTGCGGCAT ACCAGGAGGA CCCTGACTAC 300 

TTGCX5CAAGT GCCTGCGCTC GGCCCAGCGC ATCTCCTTCC CTGACCTCAA GGTGGTCATG 360 

GTGGTGGATG GCAAC06CCA GGAGGACGCC TACATGCTGG ACATCTTCCA GGAGGTGCTG 420

GGOGGCACOO AGCAGGCOGG CTTCTTTGTG TGGOGCAGCA ACTTCCATGA GGCAGGOGAG 480 

GGTGA6A0GG AGGCCAGCCT GCAGGAGGGC AT6GAC0GTG TG06GGATGT GGTGC6GGCC 540 

AGCACCTTCr CGTGCATCAT GCAGAAGTGG GGAGGCAAGC GOGAGGTCAT GTACACGGCC 600 

TTCAAGGCCC TCGGCGATTC GGTGGACTAC ATCCAGGTGT GOGACTCTGA CACTGTGCTG 660 

GATCCAGCCT GCACCATOGA GATGCTTCGA GTCCTGGAGG AGGATCCCCA AGTAGGGGGA 720 

GTOSGGGGAG ATGTCCAGAT CCTCAACAAG TAOSACTCAT OGATTTCCTT OCTQAGCAGC 780 

GTGOGGTACT GGATGGCCTT CAACGTGGAG 0GG6CCTG0C AGTCCTACTT TG6CTGTGTG 840 

CAGTGTATTA GTGGGCCCTT GGGCAT6TAC OGCAACAGCC TCCTCCAGCA GTTCCTGGAG 900 

GACTGGTAOC ATCAGAAGTT CCTAGGCAGC AAGTGCAGCT TCGGGGATGA CCGGCACCTC 960 

ACCAACXX5AG TCCTGAGCCT TGGCTACCGA ACTAAGTATA CCGCGCGCTC CAAGTGCCTC 1020 

ACAGAGACCC CCACTAAGTA CCTCCX3GTGG CTCAACCAGC AAACCCGCTG GAGCAAGTCT 1080 

TACTTCOGGG AGTGGCTCTA CAACTCTCTG TGGTTCCATA AGCACCACCT CTGGATGACC 1140 

TAOSAGTCAG TGGTCACGGG TTTCTTOCOC TTCTTOCTCA TTGCCAOGGT TATACAGCTT 1200 

TTCTAOOGOG CGOGCATCTG QAACATTCTC CTCTTOCTG C TGACGGTGGA GCTCGTGG6C 1260 

ATTATGAAOG CCACCTACGC CXGCTTOCTT 06GGGCAATG CAGAGATGAT CTTGATGTCC 1320 
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CTCTACTCCC TCCTCTATAT GTCCAGCCTT CTGCXGGCCA AGATCTTTGC CATTGCTACC 
ATCAACAAAT CTGGCTQGGG CACCTCTQGC OGAAAAACCA TrGTGGTGAA CTTCATTGGC 
CTCATTOCTG TGTCCATCTG GGTGGCAGTT CTCCTGGAGG GGCIGGCCTA CACA GCTT AT 
TGOCAGGACC TGTTCAGTGA 6ACAGAGCTA GCCTTCCTTG .TCTCTGG(3GC TATACTGTAT 
GGCTGCTACT GGGTGGCCCT OCTCATGCTA TATCTGGCCA TCATOGCCOG GOGATGTGGG ' 
AAGAAGOOGG AGCAGTACAG CTTGGCTTTT GCTGAGGTGT GACATGGCCC CCAAGCAGAG 
CGGGTAAAGT GCAATGGGTA AGGGAGGGAA GGGGAATGGA AGAGAAAAGA CAGGGTGGGA 
OGGACSGAGGG AGTGCTGrGT TTTftGfrCTCT TAATGGTOCA ARGGACAAAT CTAAAATGCA 
AAGAACX3GTG ATGTAGTATG GCCTGACAGC TCTGTTTAGA GGAGGCAACA CTGATCCCCC 
AGATCCAGGG CTCCAGGGGA TrCTGTGTTT TCAGACTGCC TGTCTGCTTG CATCTGCACA 
TAGGCAGTAG CCTCCTCCTG GGCTCCAGAG OGCACTCAGA AGTTGTGCTA AACCAAGTTA 
AGTCCCATTC AGTGGCAACT TGTGATAGGT ACCTGAGTGA OGGCAACCTG CGGAAGGAGG 
TTCTCCCAGC CCATCTCAAC ACAACCAGAG GTGGCAGGAG AATTTCTACT GAGOGAGGTG 
GGCCGGTTAG TGTATGTCAC CCCCACGCCA CCCATAAGTA GTCATOUITG CAATAAGATT 
GOGCGTGAGA TACAAGGCCC AGAAGCCTGA TCTTTGGGCA TCAGAAAACA GGGTOCAGGA 
ATGGTGCTTT ATGTGAGATA CXSXACTCCA CATCAACATT CCAGGGATGA GCCAAACCAG 
CAGG6AGTTA GCACTGAACT GCTTTTAAAA GTGCACATTA AAAAGGAAAG TT TGCC AGGA 
GGAACAAAGA GATTCTGGTG GTGCTAAAGG AGGCCATAAG CTACACAGAG GCCTrGGGTG 
TTCCACCTGG AAACTGCTCA GACGTCTAGA TGCGTTCTTA GCTTGTCTGT GATCTCTGCT 
GGGGAGATAA AAAGATTAAG CCCCAACATG TTCAGAAAAG AAGTGAAGTC TTGGGTATTT 
TAACCTGTAT ACTCTTGAAT TCCTCTCAAA TTCAGCTCTG ATCTGAGGCT AAGACACACT 
CCCCACTTCA CTTTCTTCAA AGCCACATTT TTTGAGGTAT CACTGCAGTC ACCTCTTCTA 
CCCTCATCAT CATAGGTAAG GTTTTCAAGG TGGCAATTGG GGCGGAGCCC CGGCTTCTTA 
TAGAAGCTTC AGCAGGAGGC AAGCGTGTTC TCAGCACATA TGGGAACTAT GAG GAGO CTC 
TCATCAAATT GGCTACAATC TTGGAGCTGC TTGGACGGAT TCCTTGGCAG CCGG GTTAGC 
ATGTGTGACT TTCAGGCTAC TGTTCTTGAC AATCATCTCC AATGGAAAGC TTTTCAGTGT 
TCCCAAAGT6 AACTCTCAAA TCCAAAATGG TTATCTTTGA GACCATCCAT TCTCCTCAGT 
GGCTTCTCCA GGGAATTCTT ACAGCCAAGT TGTGACAGTC ACTGCATTTG CCTGCTTCTT 
TCCAGAAACC AAACTAGGAG ATGAAACTGG TTCCTACATC CTA AGGT TCT TGCTTTCTCT 
CTCATGCCTC CTGAGGCTGT TTTTG6CTGT TTTCCCTCTG CTGCTTTTGG GGAATGAGOG 
GAAGCCATTT TCCAAGTGAC TTGCAATCCA GGCTGTTCTC AGCGTTTTGA GTTTAAAACC 
TGGGATCCTG ACTAAGCCTT TGACTTAAGG GTTGCTTGCT TGCCCTCCAA ATGTCCTTTC 
TCAAAGGGGC CAACTAACCC GTGCAGAACC AGCACTAAGG TGGACAGCAG ACAAGAGGGC 
AAGCCTCTAA TGTACCAAGT GCTTCCTACA AAGACGCAAG GTGTGCTCCG AACCACAGAT 
GGGCAAACCC TGGTGCTTTC CTTCATCTCC CAGGAACTCA AGGGTTTTCC AAGTGTAGCT 
AACAGTTGCC 'ACATCAavCA GACCTCCAGT TTCTGGTAAG ACTGCTGGTT GACATCAGAC 
CCAACCCATT GAAGGCTGGA AGGCAGCAGG CATTTGCTAA GGCAGCTGAT CCAGGCAATC 
GTTCTGCTGG CCAAGAAGTT AAACTATTTT GAGCATTAGA ATGGAGGAAA TCCX3GTCAGC 
CAWSTGCAGA GTTCAGACTT CGCTAAGGGC TTGTTTTTCT TCAGCATTTA CTTGAAGATT 
AATGTAGGAT GACAGGCTCT CCTGGCTGTC CTACCATCAG CTCTGCCTTG CACTGTGGTC 
GTCAACTTTC CTCAAATCAA AAACAGGCAG GTACAGGTAG TGGGCTCACA ACGTTTGACC 
TCGACTGGTT TTTCTAAGTr ATTTTGTACA TTTTTCAGCA GCAAAAOCAA A CTGGG TCTT 
CAGCTTTATC CCCOTTTCTT GCAAGGGAAG AGCCTTTATA CAATTGGACG CATTTTGGTT 
TTTCCTCATT GAGAATTCAA ATCCTCTTTT GTATTGTTTC TACAATAATT TGTAAACATA 
TTTATTTTTA CCTGCTTTTT tTTTTTTTTT TAATTTTCAG GTCAAGTTTT TTATACTGCa 
CTTATTTGTC AAAATAAAGA TTCTCACAT 

Seq ZD NO: 174 Protein sequence: 
Protein Accession #: AAF36984 
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MPVQLTTALR 
LFAFLEHRRM 
WDCaiRQEDA 
STFSCIMQKH 
VGGDVQILNK 
DWYHQKFLGS 
YFREIVLYNSL 
IIKATYACFL 
LIPVSIWVAV 
KKPBQYSLAF 



11 

1 

WGTSLFALA 
RRAGQALKLP 
YHLDIFHEVL 
GGKREVMYTA 
VDSWISFLSS 
KCSFGDDRHL 
WFHKHHLWMT 
RGNAEMIPMS 
LLEGtAYTAY 
AEV 



21 

I 

VLGGILAAYV 
SPSRGSVALC 
GGTEQAGFFV 
FKALGDSVDY 
VRYWMAFNVE 
TNRVLSLGYR 
YESWTGFFP 
LYSLLYMSSL 
CQDLFSETEL 



31 



TGYQFIKTEK 
lAAYQEDPDY 



IQVCDSOTVL 
RACQSYFGCV 
TKYTARSKCL 
FFLIATVIQL 
LPAKIFAIAT 
AFLVSGAILY 



41 
I 

HYLSFGLYGA 
LRKCLRSAQR 
GETEASLQEG 
DPACTIEMLR 
QClSGPIiGMY 
TETPTKYLRW 
FYRGRIWNIL 
ZHKSGHGTSG 
GCYWVALLML 
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1 

ILGLHLLIQS 
ISPPOLKWM 
MDRVROWRA 
VLEEDPQVGG 
RNSLLQQFLE 
LNQQTRWSKS 
LFLLTVQLVG 
RKTIWNFIG 
YLAZZARROG 



1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
378Q 
3840 
3900 
3960 
4020 



60 
120 
180 
240 
300 
360 
420 
460 
540 



Seq ID NO: 175 DNA sequence 
Nucleic Acid Accession #: NM_000691
Coding sequence: 43..1404

1 11 21 31 41 51

| | | | | |

CCAGGAGCCC CAGTTACCGG GAGAGGCTGT GTCAAAGGCG CaVTGAGCAA GATCAGCGAG 60

GCCGTGAAGC GCGCCOGCGC CXSCCTTCAGC TCGGGCAGGA CCGGTCCGCT GCAGTTCCGA 120

TTCCAGCAGC TGGAGGGGCT GCAG06CCTG ATCCAGGAGC AGGAGCAGGA GCTGGTGGGC 180

GCGCTGGCCG CAGACCTGCA CAAGAAT6AA TGGAAGGCCT ACTATGAGGA GGTGGTGTAC 240

GTCCTAGAGG AGATCGAGTA CATGATCCAG AAGCTCCCTG AGTGGGCCGC GGATGAGCXX; 300

GTGGAGAAGA CGCCCCAGAC TCAGCAGGAC GAGCTCTACA TCCACTCGGA QCCACTGGGC 360

GTGGTCCTCG TCATTGGCAC CTGGAACTAC CCCTTCAACC TCACCATCCA GCCCATG6TG 420

GGCGCCATCG CTGCAGGGAA CGCAGTGGTC CTCAAGCCCT CGGAGCTGAG TGAGAACATG 480

GCGAGCCTGC TGGCTACCAT CATCX:CCCAG TACCTGGACA AGGATCTGTA CCCAGTAATC 540

AATGGGGGTG TCCCTGAGAC CACGGAGCTG CTCAAGGAGA GGTTCXsACCA TATCCTGTAC 600

AOGGGCAGCA OGGGGGTGGG GAAGATCATC ATGAOQGCTG CT6CCAAGCA CCTGACCCCT 660

GTCACGCTGG AGCTGGGAGG GAAGAGTCCC TGCTAOGTGG ACAAGAACTG TGACCTGGAC 720

GTGGCCTGCC GAOGCATCGC CTGGGGGAAA TTCATGAACA GTGGCCAGAC CTGCGTGGCC 780

CCAGACTACA TCCTCTGTGA CCCCTOSATC CAGAACCAAA TTGTGGAGAA GCTCAAGAAG 840

TCACTGAAAG AGTTCTACGG G6AAGATGCT AAGAAATCCC GGGACTATGG AAGAATCATT 900

AGTGCCGGGC ACTTCCAGAG GGTGATGGGC CTGATTGAGG GCCAGAAGGT GGCTTATGG6 960

GGCACGGGGG ATGCG6CCAC TC3GCTACATA GCCCCCACX31 TOCTCAOGGA GGTG6ACCCC 1020

CAGTCCCCGG TGATGCAAGA GGACATCTTC GGGGCTGTGC TGOCCATOCST GTGC6TG0GC 1080

AGCCTGGAGG AOSCCATCCA GTTCATCAAC CAGOGTGAGA AtSCCCCTOGC CCTCTACATG 1140 

TTCTCCAGCA ACGACAAGGT GATTAAGAAC ATGATTGCAG AGACATCCAG TGGTGGQGTG 1200 

GGOGCCAAOG ATGTCATOGT CCACATCACC TTGCACTCTC TGCCCTTCGG GGGCGTGGGG 1260 

AACAGOGGCA TGGGATCCTA CCKTQSCMG AAGAGCTT06 AGACTTTCTC TCROO GCOGC 1320 

TCTTCOCTOG TGAGGCCTCT GATGAATGAT CAAGGGCTGA AGGTCAGATA CCOOCCGAGC 1380 

COGGCCAAGA TGACCCAGCA CTGAGGAGGG GTTGCTCOGC CTGGCCTGGC CATACTGTGT 1440 

CCCATOGGAG TGCGGACCAC CCTCACTGGC TCTOCTGGCC CTGGAGAATC GCTCCTGCAG 1500

CCCX»C3CCCA GCCCCACTCC TCTGCTGACX: TGCTGACXTC TGCACACCCC ACTCCCACAT 1560 

GGGGCCAGGC CTCACCATTC CAAGTCTCOV CCCCTTTCTA GACCAATAAA GAGACAAATA 1620 
CAATTTTCTA ACTOGG 

Seq ID NO: 176 Protein sequence: 
Protein Accession #: NP_000682

1 11 21 31 41 51 

isKISEAVKR ARAAFSSGRT RPLQPRFQQIi EALQaLIQEQ BQBLVGMAA DLHKNEWNAY 60 

YEEWYVLBE IBYMIQiCLPE WAADEPVEKT PQTQQDBLYI HSEPLGWLV IGTWNYPFNL 120 

TIQPMVGAIA AGNAWLKPS ELSENMASLL ATIIPQYLDK DLYPVIKGGV PETTELLKER 180 

FDHILYTCST GVGKIIMTAA AKHLTPVTLE LGGKSPCYVD KUCDIJWACR RIAWGKFKNS 240 

GQTCVAPDYI LCDPSIQNQI VEKLKKSLKE PYGEDAKKSU DYGRIISARH FQRVKGLIEG 300 

QKVAYGGTGD AATRYIAPTI LTOVDPQSPV MQEEIPGPVL PIVCVRSI.EE AIQPINQREK 360 

PLALYMFSSN DKVIKKMIAB TSSGGVAAND VIVHITLHSI. PFGGVGNSGM GSYHGKKSFB 420 
TFSHRRSCLV RPLMNDEGLK VRYPPSPAKM TQH 

Seq ID NO: 177 DNA sequence 

Nucleic Acid Accession #: NM_001067.X 

Coding sequence: 108-4703 

1 11 21 31 41 51 

CTAACCX3ACG CGCGTCTGTG GAGAAGCGGC TTGGTCGGGG GTGGTCTCGT GGGGTCCTGC 60 

CTGTITAGTC GCTTTCAGGG TTCTTGAGCC CCTTCACGAC CGTCACCATG GAAGTGTCAC 120 

CATTGCAGCC TGTAAATGAA AATATGCAAG TCAACAAAAT AAAGAAAAAT GAAGATGCTA 180 

AGAAAAGACT GTCTGTTGAA ACAATCTATC AAAAGAAAAC ACAATTGGAA CATATTTTGC 240 

TCCGCCCAGA CACCTACATT QGTTCT6TGG AATTAGTGAC CCAGCAAATG TGGGTTTACG 300 

ATGAAGATGT TGGCATTAAC TATAGGGAAG TCACTTTTGT TCCTGGTTTG TACAAAATCT 360 

TTGATGAGAT TCTAGTTAAT GCTGCGGACA ACAAACAAAG GOAOCCAAAA ATGTCTTGTA 420 

TTAGAGTCAC AATTGATCCG GAAAACAATT TAATTAGTAT ATGGAATAAT GGAAAAGGTA 480 

TTCCTGTTGT TGAACACAAA GTTGAAAAGA TGTATGTCCC AGCTCTCATA TTTGGACAGC 540 

TCCTAACTTC TAGTAACTAT GATGATGATG AAAAGAAAGT GACAGGTGGT CGAAATGGCT 600 

ATGGAGCCAA ATTGTCTAAC ATATTCAGTA CCAAATTTAC TGTGGAAACA GCCAGTAGAG 660 

AATACAAGAA AATGTTCAAA CAGACATGGA TGGATAATAT GGGAAGAOCT GGT6AGATGG 720 

AACTCAAGCC CTTCAATGGA GAAGATTATA CATGTATCAC CTTTCAGCCT GATTTGTCTA 780 

AGTTTAAAAT GCAAAGCCTG GACAAAGATA TTGTTGCACT AATGGTCAGA AGAGCATATG 840 

ATATTGCTCG ATCCACCAAA GATGTCAAAG TCTTTCTTAA TGGAAATAAA CTGCCAGTAA 900 

AAGGATTTCX3 TAGTTATGTG GACATGTATT TGAAGGACAA GTTGGATGAA ACTGGTAACT 960 

CCrCGAAAGT AATACATGAA CaAGTAAACC ACAGGTGGGA AGTGTGTTTA ACTATGAGTG 1020 

AAAAAGGCTT TCAGCAAATT AGCTTTGTCA ACAGCATTGC TACATCCAAG GGTGGCAGAC 1080 

ATGTTGATTA TGTAGCTGAT CAGATTGTGA CTAAACTTGT TGATGTTGTG AAGAAGAAGA 1140 

ACAAGGGTGG TGTTGCAGTA AAAGCACATC AGGTGAAAAA TCACATGTGG ATTTTTGTAA 1200 

ATGCCTTAAT TGAAAACCCA ACCTTTGACT CTCA6ACAAA A6AAAACATG ACTTTACAAC 1260 

CCAAGAGCTT TGGATCAACA TGCCAATTGA GTGAAAAATT TATCAAAGCT GCCATTGGCT 1320 

GTGGTATTGT AGAAAGCATA CTAAACTGGG TGAAGTTTAA GGCCCAAGTC CAGTTAAACA 1380 

AGAAGTGTTC AGCTGTAAAA CATAATAGAA TCAAGGGAAT TCCCAAACTC GATGATGCCA 1440 

ATGATGCAGG GGGOOQAAAC TCCACTGAGT GTACGCTTAT CCTGACTGAG GG AGATTC AG 1500 

CCAAAACTTT GGCTGTTTCA GGCCTTGGTG TGGTTGGGAG A6ACAAATAT GGGGTTTTCC 1560 

CTCTTAGAGG AAAAATACTC AATGTTCGAG AAGCTTCTCA TAAGCACATC ATGGAAAATG 1620 

CTGAGATTAA CAATATCATC AAGATTGTGG GTCTTCAGTA CAAGAAAAAC TATGAAGATG 1680 

AAGATTCATT GAAGACGCTT CGTTATGGGA AGATAATGAT TATGACAGAT CAGGACCAAG 1740 

ATQGTTCCCA CATCAAAGGC TTGCTGATTA ATTTTATCCA TCACAACTGG CCCTCTCTTC 1800 

TGOGACATCG TTTTCrGGAG GAATTTATCA CTCCCATTGT AAAGGTATCT AAAAACAAGC 1860 

AAGAAATGGC ATTTTACAGC CTTCCTGAAT TTGAAGA6TG GAAGAGTTCT ACTCCAAATC 1920 

ATAAAAAATG GAAAGTCAAA TATTACAAAG GTTTGGGCAC CAGCACATCA AAGGAAGCTA 1980 

AAGAATACTT TGCAGATATG AAAAGACATC GTATCCAGTT CAAATATTCT GGTCCTGAAG 2040 

ATGATGCTGC TATCAGCCTG GCCTTTAGCA AAAAACAGAT AGATGATGGA AAGGAATGGT 2100 

TAACTAATTT CATCGAGGAT AGAAGACAAC GAAAGTTACT TGGGCTTCCT GAGGATTACT 2160 

TCTATGGACA AACTACCACA TATCTGACAT ATAATGACTT CATCAACAAG GAACTTATCT 2220 

TGTTCTCAAA TTCTGATAAC GRGRGATCTA TCCCTTCTAT GGTGGATGGT TTGAA ACXaC 2280 

GTCAGAGAAA GGTTTTGTTT ACTTGCTTCA AAC3GGAATGA CAAGCGAGAA GTAAAGGTTG 2340 

CCCAATTAGC TGGATCAGTG GCTGAAATGT CTTCTTATCA TCATGGTGAG ATGTCACTAA 2400 

TGATGACCAT TATCAATTTG GCTCAGAATT TTGTGGGTAG CAATAATCTA AACCTCTTGC 2460 

AGCCCATTGG TCAGTTTGGT ACCAGGCTAC ATGGTGGCAA GGATTCTGCT AGTCCACGAT 2520 

ACATCTTTAC AATGCTCAGC TCTTTGGCTC QATTGTTATT TCCACCAAAA GATGATCACA 2580 

CGTTCAA6TT TTTATATGAT GACAACCAGC 6TGTTGAGCC T6AATGGTAC ATTCCTATTA 2640 

TTCCCATGGT GCTGATAAAT GGTCCTGAAG GAATCGGTAC TGGGTGGTCC T6CAAAATCC 2700 

CCAACTTTGA TGTGCX3TGAA ATTGTAAATA ACATCAGGCG TTTGATGGAT GGAGAAGAAC 2760 

CTTTGCCAAT GCTTCCAAGT TACAAGAACT TCAAGGGTAC TATTGAAGAA CTGGCTCCAA 2820 

ATCAATATGT GATTAGTGGT GAAGTAGCTA TTCTTAATTC TACAACCATT GAAATCTCAG 2880 

AGCTTCCCGT CAGAACATGG ACCCAGACAT ACAAAGAACA AGTTCTAGAA CCCATGTTGA 2940 

ATGGCACCGA GAAGACACCT CCTCTCATAA CACACTATAG GGAATACCAT ACAGATACCA 3000 

CTGTGAAATT TGTTGTGAAG ATGACTGAAG AAAAACTGGC AGAQGCAGAG AGAGTTG GAC 3060 

TACACAAAGT CTTCAAACTC CAAACTAGTC TCACATGCAA CTCTATGGTG CTTTTTGACC 3120 

ACGTAGGCTC TTTAAAGAAA TATGACACGG TGTTGGATAT TCTAAGAGAC TTTTTTGAAC 3180 

TCAGACTTAA ATATTATGGA TTAAGAAAAG AATGGCTCCT AGGAATGCTT GGTGCTGAAT 3240 

CTGCTAAACT GAATAATCAG GCTCGCTTTA TCTTAGAGAA AATAGATGGC AAAATAATCA 3300 
TTGAAAATAA GCCTAAGAAft GAATTAATTA AAGTTCTGAT TCAGAGGGGA TAT6ATTGQ6 3360 

ATCCTGTGAA GGCCTGGAAA GAAGOCCAGC AAAAGGTTCC AGATGAAGAA GAAAAIGAAG 3420 

AGAGTGACAA CGAAAAGGAA ACTGAAAAGA CTGACTCCGT AACAGATTCT GGACCAACC7 3480 

TCAACTATCT TCTTGATAaC CCOCTTIGGT ATTTAACCAA GGAAAAGAAA GATGAACTCT 3540 

GCAOGCtAAG AAA3GAAAAA GAACAAGAGC TQGACACATT AAAAAGAAAG A0TCCATCAG 3600 

ATTTOtGGAA AGAAGACTTG GCTaCATTTA TTGAAGAATT GGAGGCTOTT GAAGCCAAGG 3660 

AAAAACAAGA TGAACAAGTC GCSVCTTOCTG GGAAAGGGGG &iACGCCMG GGGAAAAAAA 3720 

CACAAATGGC TGAAGTTTTG CCTTCTCCGC GTGGTCAAAG AGTCATTCCA OGAATAACCA 3780 

TAGAAATGAA AGCAGAGGCA GAAAAGAAAA ATAAAAAGAA AATTAAGAAT GAAAATACTG 3840 

AAQGAAGCCC TCAAGAAGAT GGTGTOGAAC TAGAAGGCCT AAAACAAAGA TTAGAAAAGA 3900 

AACAGAAAAG AGAACCAGGT ACAAAGACAA AGAAACAAAC TACATTGGCA TTTAAGCCAA 3960 

TCAAAAAAOG AAAGAAGAGA AATOCCTQGC CTGATTCAGA ATCAGATAGG AGCAGTGAOG 4020 

AAAGTAATTT TGA7GTCCCT CCAOGAGAAA CAGAGCCAGG GAQA6CAGCA ACAAAAACM 4080 

AATTCACAAT GGATTTGGAT TCAGATGAAG ATTTCTCAGA TTTTGATGAA AAAACTGAT6 4140 

ATGAAGATTT TGTCCCATCA GATGCTAGTC CACCTAAGAC CAAAACTTCC CCAAAACTTA 4200 

GTAACAAAGA ACTGAAACCA CAGAAAAGTG TOGTGTCAGA CCTTGAACCT GATGATGTTA 4260 

AGGGCAGTGT ACCACTGTCT TCAAGCOCTC CTGCTACACA TTTCCCAGAT GAAACTGAAA 4320 

TTACAAAGCC AGTTCCTAAA AAGAAICIGA CAGTGAA6AA GACAGGAGCA AAAAGTGAGT 4380 

CTTCCACCTC CACTACGGGT GCCAAAAAAA GGGCTGCCCC AAAAGGAACT AAAAOQGATC 4440 

CAGCTTTGAA TTCTGGTGTC TCTCAAAAGC CTGATCCTGC CAAAACCAAG AATOGCOGCA 4500 

AAAGGAAGCC ATCCACTTCT GATGATTCTG ACTCTAATTT TGAGAAAATT GTTTOGAAAG 4560 

CAGTCACAAG CAAGAAATCC AAGGGGGAC3A GTGATGACTT CCATATGGAC TTTGACTCAG 4620 

CTGTGGCTCC TCGGGCAAAA TCTGTACGGG CAAAGAAACC TATAAAGTAC CTGGAAGAGT 4680 

CAGATGAAGA T6ATCTGTTT TAAAATGTGA GGOSATTATT TTAAGTAATT ATCTTACCAA 4740 

GCGCAAGACT GGTTTTAAAG TTACCTGAAG CTCTTAACTT CCTCCCCTCT GAATTTAGTT 4800 

TGGGGAAGGT GTTTTTAGTA CAAGACATCA AAGTGAAGTA AASCCCAAGT GTTCTTTAGC 4860 

TTTTTATAAT ACTGTCTAAA TAGTGACCAT CTCATGGGCyV TTGTTTTCTr CTCTGCTTTG 4920 

TCTGTGTTrT GAGTCTGCTT TCTTTTGTCT TTAAAACCTG ATTTTTAAGT TCTTCTGAAC 4980 

TGTAGAAATA GCTATCTGAT CACTTCAGCG TAAAGCAGTG TGTTTATTAA CX: ATCCA CTA 5040 

AGCTAAAACT AGAGCAGTTT GATTTAAAAG TGTCACTCTT CCTCCTTTTC TACTTTCAGT 5100 

AGATATGAGA TA6AGCATAA TTATCTGTTT TATCTTAGTT TTATACATAA TTTACCATCA 5160 

GATAGAACTT TATGGTTCTA GTACAGATAC TCTACTACAC TCAGCCTCTT ATGTGCCAAG 5220 

TTTTTCTTTA AGCAATGAGA AATTGCTCAT GTTCTTCATC TTCTCAAATC ATCAGAGGCC 5280 

AAAGAAAAAC ACTTTGGCTG TGTCTATAAC TTGACACAGT CAATAGAATG AAGAAAATTA 5340 

GAGTAGTTAT GTGATTATTT CAGCTCTTGA CCTGTCCCCT CTGGCTGCCT CTGAGTCTGA 5400 

ATCTCCCAAA GAGAGAAACC AATTTCTAAG AGGACTGGAT TGCAGAAGAC TOGGGGACAA 5460 

CATTTGATCC AAGATCTTAA ATGTTATATT GATAACCATG CTCAGCAATG AGCTATTAGA 5520 

TTCATTTTGG GAAATCTCCA TAATTTCAAT TTGTAAACTT TGTTAAGAOC TGTC TACATT 5580 

GTTATATGTG TGTGACTTGA GTAATGTTAT CAAOGTTTTT GTAAATATTT ACTATGTTTT 5640 
TCTATTAGCT AAATTCCAAC AATTTTGTAC TTTAATAAAA TGTTCTAAftC ATTGC 

Seq ID NO: 178 Protein sequence: 
Protein Accession #: NP_001058.1 

1 11 21 31 41 51 

| | | | | | 

MBVSPLQPVN ENKQVNKIKK NEDAKKRLSV ERIYQKKTQIi EHILLRPDTY IGSVELVTQQ 60 

MWVYDEDVGI NYREVTPVPG LYKIFDEILV NAADNKQRDP KMSCIRVTID PENNLISIWN 120 

NGKGIPVVEH KVEKMYVPAL IFGQLLTSSN YDDDEKKVTG GRNGYGAKLC NIFSTKFTVE 180 

TASREYKKMF KOTWMDNMGR AGBMELKPFK GEDYTCITFQ PDLSKFKMQS LDKDIVALMV 240 

RRAYDIAGST XDVKVPLNGN KLPVKGFRSY VDMYLKOKLD ETGNSLKVIH BQVNHRHEVC 300 

LTMSEXGFQQ ISFVNSIATS KGGRHVDYVA DQIVTICLVDV VKKKNKGGVA VKAHQVKNHM 360 

WIFVNALIEN PTFD5QTKEN MTLCPKSFGS TCQLSEKFIK AAIGCGIV5S ZIiNWVKFKAQ 420 

VQLNKKCSAV KHNRIKGIPK LDDANDAGGR NSTECTLILT EGDSAKTLAV SGLGWGRDK 480 

YGVFPLRGKI LNVREASHKQ IMENAEINNI IKIVGLQYKK NYEDEDSUCT LRYGKIMIMT 540 

OQDQDGSKIK GLLIITFIHHH HPSLLRHRFL EEPITPIVKV SKNKQSMAFY SLPEFBEHKS 600 

STPNHKKHKV KYYKGLGTST SKEAKEYFAD MKRHRIQFKY SGPBDDAAIS LAFSKXQIDD 660 

RKEWLTNFHE DRRQRKLLGL PEDYLYGQTT TYLTYNDFIM KELILFSNSO NBRSIPSKVD 720 

GLKPGQRKVL PTCPKRNDKR BVKVAQLAGS VAEMSSYHHG EMSLMMTIIN LAQNFVGSNN 780 

LNLLQPIGQF GTRLHGGKDS ASPRYIFTML SSLARIiLFPP KODHTLKFLY DDNQRVEPEW 840 

YIPIIPMVIil NGAEGIGTGW SCKIPNPDVR EIVNNIRRLM DGEEPLPMLP SYKNFKGTIE 900 

ElAPNQYVIS GEVAILNSTT lEXSELPVRT WTQTYKEQVL EPMLNGTBKT PPLITDYREY 960 

HTDTTVKFW KMTEEKLAEA ERVGLHKVFK LQTSLTCNSM VLFDHVGCLK KYDTVLDILR 1020 

DFFELRLKYY GLRKEWLLGM LGAESAKLNN QARFILBKID GXIIIENKPK KELIKVLIQR 1080 

GYDSDPVKAW KEAQQKVPOE EENEESDNBK ETBKSDSVTD SGPTFNYLLO MPLWYLTKEK 1140 

KDELCRLRNE KEQELDTLKR KSPSDLWKED LATFIEELEA VEAKEKQDEQ VGLPGKGGKA 1200 

KGKKTQMAEV LPSPRGQRVI PRITIEMKAE AEKKNKKKIK NENTEGSPQE DGVELBGLKQ 1260 

RLEKKQKREP GTKTKKQTTL AFKPIKKGKK RNPWPDSESD HSSDESNFDV PPRETEPRRA 1320 

ATKTKFTKDL DSDEOFSOFO EKTQDEDFVP SDASPPKTKT SFKLSIIKELK PQKSWSDLS 1380 

AODVKGSVPL SSSPPATKFP DETEITNPVP KKKVTVKKTA AKSQSSTSTT GAKKRAAPKG 1440 

TKRDPALNSG VSQKPDPAKT KNRRKRKPST SDDSDSNFEK ZVSKAVTSKK SKGESDDFHM 1500 
DFOSAVAPRA KSVRAKKPIK YLEBSDEDDXj F 



Seq ID NO: 179 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 148-7095 

1 11 21 31 41 51 

| | | | | | 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 
CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCOGCA 120 
CGGCGAGGGG CGGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 
CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 
AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 
CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 
AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 
GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 
AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 
GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 
GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 
GATTTCAAAG CGATTATTGA TGGAGTGGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 
TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 
AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 
ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 
TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 
TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 
AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 
TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 
CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 
GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 
TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 
AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 
GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 
AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 
ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 
AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 
ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 
GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 
AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 
AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 
GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 
GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 
GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 
GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 
AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 
TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 
CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 
TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 
GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 
ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520 
TCGGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATCCATCCTG 2580 
TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 
TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACOGAGAGT 2700 
GATAAGGTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 
AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820 
TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 2880 
AGCAGTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940 
GATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 
GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAGCG GCCCTAGCCA TATACCAATA 3060 
CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120 
GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180 
GGGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 
TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300 
ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360 
CCCAACATGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420 
ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 
GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 
TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 
TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 
ACCTCAGCTT CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAGGC TTCTGATGTT 3720 
GACACCTTGC TTAAAACTGT TCTTCCAGCT GTGCCCAGTG ATCCAATATT GGTTGAAACC 3780 
CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 
AGTGAAAACA TGCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTCGCC TACTTCTCAT 3900 
ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATC CAAGTGAGAA ATATGAACCA 3960 
GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020 
TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 
TTTGCTACAC CTGTTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 
CATTCCGATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTTGCTGGT 4200 
ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 
GGGCATGTTG CCATTACaUSC TGTTTCTCCC GACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 
TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCOGGT 4380 
TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATGATG GTGATGATGA TGATGATGAC 4440 
AGAGGTAGTG ATCGCTTATC CATTCATAAG TGTATGTCAT GCTCATCCTA TAGAGAATCA 4500 
CAGGAAAAGG TAATGAATGA TTCAGACACC CACGAAAACA GTCTTATGGA TCAGAATAAT 4560 
CCAATCTCAT ACTCACTATC TGAGAATTCT GAAGAAGATA ATAGAGTCAC AAGTGTATCC 4620 
TCAGACAGTC AAACTGGTAT GGACAGAAGT CCTGGTAAAT CACCATCAGC AAATGGGCTA 4680 
TCCCAAAAGC ACAATGATGG AAAAGAGGAA AATGACATTC AGACTGGTAG TGCTCTGCTT 4740 
CCTCTCAGCC CTGAATCTAA AGCATGGGCA GTTCTGACAA GTGATGAAGA AAGTGGATCA 4800 
GGGCAAGGTA CCTCAGATAG CCTTAATGAG AATGAGACTT CCACAGATTT CAGTTTTGCA 4860 
GACACTAATG AAAAAGATGC TGATGGGATC CTGGCAGCAG GTGACTCAGA AATAACTCCT 4920 
GGATTCCCAC AGTCCCCAAC ATCATCTGTT ACTAGCGAGA ACTCAGAAGT GTTCCACGTT 4980 
TCAGAGGCAG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 5040 
GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 5100 
CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 5160 
TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220 
ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 5280 
CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 5340 
CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAGACAGCT CCAACCACCC AGACAACAAG 5400 
CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 5460 
CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520 
AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 5580 
TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 5640 
AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 5700 

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTPGCCTATT ATACTGltSAG GAATTTTACT 5760 

CTAAGAAACA CAAAAATAAA AAAGC3GCTCC CAGAAAGGAA GACCCACTGG AOGTGTGGTC 5820 

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 5880 

CTGACCTTTG TGftCAAAGGC AGOCTATGCC AAGOCCCATG CAGTGGGGCC TGTTCTGSTC 5940 

CACTGCAGTG CTGGAGTTGG AAGAACACGC ACATATATTG TGCTAGACAG TATGTT GCftG 6000 

CAGATTCAAC AOGAAQGAAC TGTCAACATA TnGGCTPCT TAAAACACAT COGTTCACAA 6060 

AtSAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 6120 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 6180 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GCTCCTGAGC 6240 

CAGTCAAATA TACAGCAGAG TGACTATTCT CCAGCC3CTAA AGCAATGCAA Cft GOGAA AAG 6300 

AATOGAACTT CTTCTATCAT COCTGTGGAA AGATCAAGGG TTGGCATTTC ATCCCTGAGT 6360 

GGACAAiGGCA CAGACTACAT CAATGCCTCC TATATCATGG GCTATTACCA GAGCAATGAA 6420 

TTCATCATTA CO^GCACCC TCTCCTTCAT ACCATCAAGG ATPrCTGGAG GATGATATGG 6480 

GACCATAATG CCCAACTGGT GGTTATGATT CCTGATGGCC AAAACATGGC AGAAGATGAA 6540 

TKGTTTACr G6CCAAATAA AGATGASCCT ATAAATTGTG AGAGCTTTAA GGTCACTCTT 6600 

ATGGCTGAAG AACACAAATG TCTATCTAAT GAGGAAAAAC T TATAA TTCA GGACTTTATC 6660 

TTAGAAGCTA CACAGGATGA TTATGTACTT GWVGTGAGGC ACTTTCAGTG TCCTAAATGG 6720 

CCAAATCCAG ATAGCCCCAT TAGTAAAACT TTTGAACTTA TAAGTGTTAT AAAAGAflGAA 6780 

GCTGCCAATA GGGATGGGCC TATGATTGTT CATGATGAGC ATG6AGGAGT GAQQGCAGGA 6840 

ACTTTCTOTG CTCTCACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC CGTGGATGTT 6900 

TACCAGGTAG CCAAGATGAT CAATCTGATG AGGCCAGGAG TCTTTGCTGA CATTGAGCAG 6960 

TATCAGTrrC TCTACAAAGT GATCCTCAGC CTTGT6AGCA CAAGGCAGGA A6AGAATCCA 7020 

TCCACCTCTC TGGACAGTAA TGGTGCAGCA TTGCCTGATG GAAATATAGC TGAGAGCTTA 7080 

GAGTCTTTAG TTTAACACAG AAAGGGGTGG QGGGACTCAC ATCTGAGCAT TGTTTTCCTC 7140 

TTCCTAAAAT TAGGCAGGAA AATCAGTCTA GtTCTGTTAT CTGTTGATTT CCCATCACCT 7200 

GACAGTAACT TTCATCACAT AGGATTCTGC CGCCAAATTT ATATCATTAA CAATGTGTGC 7260 

CTTTTTCCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATCATTG AATTTTACAG 7320 

TATTTCTAAfi AATG6AATTC TGGTATTTTT TTCTGTATTG ATTTTAACAG AAA ATTTCA A 7380 

TTTATAGAGG TTAGGAATTC CAAACTACAG AAAATGTTTG TrTTTAGTGT CAAATTTTTA 7440 

GCTGTATTrG TAGCAATTAT CAGGTTTGCT ACAAATATAA CTITTAATAC AGTAGCCTGT 7500 

AAATAAAACA CTCTTCCATA TGATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC TCTTACTTAT TGTAAATACT GCCCTAGTGT CTCCATGGAC CAAATTTATA 7620 

TTTATAATTC TAGATTTTTA TATTTTACTA CTGAGTCAAG TTTT CTAGTT CTGTGTAATT 7680 

GTrPAGTTTA ATGACGTAGT TCATTAGCTG GTCTTACTCT ACCAGTTTTC TGACATTGTA 7740 

TTGTCTTACC TAAGTCATTA ACTTTGTTTC AGCATGTAAT TTTAACTTTT GTGGAAAATA 7800 

GAAATACCTT CATTTTGAAA GAAGTTTTTA TGAGAATAAC AOCTTACCAA ACATTGTTCA 7860 

AATGGTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTGCCATTA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 180 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

HRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKVPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTPIHNTGK TVBINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKC2JMSSDGS EHSLEX3QKFP LE>IQIYCFDA DRFSSFEEAV KGKGKLRAUS 180 

ILPBVGTBEN LDFKAIIDGV ESVSRFGKQA ALDPFILIiNL LPtJSTDKYVI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTHQ QSGYVMLMDY LQUNPRBQQV KPSRQVFSSY 300 

TGKEEIH2AV CSSEPENVQA DPENYTSIAV TWERPRWYD TMIEKFAVtY QQLDGEDQTK 360 

HEFLTDGYQD LGAILNNLLP NMSYVliQiVA ICTNGLYGKY SDQLIVDMPT DNPELDLPPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MSLSGTABSL KTVSITEYEE ESLLTSPKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE BSLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESPLQTNYTE IRVDESETCTr KSPSAGPVMS QGPSVTDLEM PHYSTPAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNGETPLQ PSYSSEVFPL VTPLLLDNQI 780 

UITTPAASSS DSALHATPVF PSVDVSFESI LSSYDGAPIiL PPSSASFSSE LPRHLHTVSQ 840 

ILPQVTSATE SDKVPLHASL PVAGGDLLLE PSLAQYSDVL STTHAASETL EFGSESGVLY 900 

KT1J4FSQVEP PSSDAMMHAR SSGPEPSYAL SDNEGSQHIF TVSYSSAIPV HDSVGVTYQG 960 

SLPSGPSHIP IPKSSI.ITPT ASLLQPTHAL SGOGBMSGAS SDSEPLLPDT DGLTALNISS 1020 

PVSVAEFTYT TSVFGODNKA LSKSEIIYQJ ETBLQIPSPN EMVYPSESTV MPNMYDNVNK 1080 

LNASLQETSV SISSTRC»IFP GSLAHTTTKV PDHEISQVPE IWFSVQPTHT VSQASG0TSL 1140 

KPVLSANSEP ASSDPASSEM LSPSTQLLFY ETSASFSTEV U^PSFQASD VDTLLldVLP 1200 

AVPSDPILVE TPKVDiaSST MLKLIVSNSA SSENMLHSTS VPVPDVSPTS HMHSASLQGL 1260 

TISYASEKYB PVtUCSESSH QWPSLYSND ELFQTANLEI NQAHPPKGSH VFATPVLSID 1320 

EPUrrLINKL IHSDEILTST KSSVTGKVFA GIPTVASDTP VSTDHSVPIG NGHVAITAVS 1380 

PHRDGSVTST KU*PPSKATS ELSHSAKSUA GLVGGGEDGD TDDDGDDDDD DRGSDGLSIH 1440 

KCMSCSSYRE SQEKVMNDSD THENSLMDQN NPISYSLSEN SEEDNRVTSV SSDSQTGMDR 1500 

SPGKSPSANG LSQKHMDGKE ENDXQTGSAL LPLSPESKAW AVUTSDEESG SQQGTSDSLK 1560 

ENBTSTDFSP ADTNEKDADG ILAAGDSEIT PGFPQSPTSS VTSENSEVFH VSEAEASNSS 1620 

HESRIGLAEG LESEKKAVIP LVIVSALTFI CLWLVGIH YWRKCFQTAH PYLEDSTSPR 1680 

VISTPPTPIP PISDDVGAIP IKHFPKHVAD bKASSGPTEE FETtKEFYQE VQSCTVDUJI 1740 

TADSSNHPDN KHKNRYINIV AYDHSRVKLA QLAEKDGKLT DYIKANYVOG YNRPKAYIAA 1800 

QGPLKSTAED FWRMZWEENV BVIVMITNLV EMGRRKCDQY HPADGSEEYG NPLVTQKSVQ 1860 

VLAYYTVHNP TLRNTKIKKG SQKGRPSGEV VTQYHYTQiWP DNGVPEYSLP VLTPVRKAAY 1920 

AKRHAVGPW VHCSAGVGRT GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVQTEE 1980 

QYVFIHDTLV EAILSKETEV LDSHIHAYVN ALLIPGPAGK TKLEKQFQUi SQSNIQQSDY 2040 

SAALKQCNRE KNRTSSIIPV ERSRVCISSL SGBGTDYINA SYIMGYYQSN EFIITQHPLI. 2100 

HTIKDFWIMI MDHNAQLWM IPDGQNMAED EPVYWPNKDE PINCESFKVT LMAEEHKCLS 2160 

UBBKLIIQDF ILEAIQDDYV liEVBHPQCPK HPNPDSPISK TPEblSVIKE EAANRDGPtU 2220 

VHDBKGGVTA GTPCALTTLM HQI.EKESJSVD VYQVAKMINL MRPGVFADIE QYQPLYKVIL 2280 
SIiVSTRQEEN PSTSLDSNGA AIiPOGMXAES LESLV 

Seq ID NO: 181 DNA sequence 

Nucleic Acid Accession #: Eos sequence 
Coding sequence: 148-4518 

1 11 21 31 41 51 

| | | | | | 

CaCRCATACG CACGCACGAT CrCACTTOGA TCTATACACP GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCftG ACGAGOOGCA 120 

CQGOGAGGGG COGCAGACCG TCTGGAAATG CX3AATCCTAA AGOGTTTCCT OGCTTGCATT 180 

CAGCTCCTCT GTCTTPGCXE CCTGGATTGC GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAfiTAAATG TGAATCTTAA GAAACTTAAA TTTCRCGGTT OGGATAAAAC ATCATTCGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACOGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAACCAAGCA AGATAACTTT TCACTCGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATCCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAG TCAAA 660 

GOAAAAOGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG OGATTATTGA TGGAGTOGAA AGTGTTAGTC G1Tl"fGGGAA GCAG6CTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCX: TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTAIG TCATGCTGAT 6GACTACTTA CAAAACAATT TTOSAGAGCA ACAG TACAAG 1020 

TTCTCTAGAC AGGTCTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AfiTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTOTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AtSTTTGCAGT TTTGTAOCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTQA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTOGACAT GCCTACTGAT 1380 

AATCCTGAAC TTOATCrTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTCAATC CTGGTAGAGA CAfiTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TC6CATAGGG 1560 

AOGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACACAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 
GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCJGG GGACTGCAGA ATCXTTTAAAT AOVGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCA6 AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTCGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC A6ACTAATTA CACTGAGATA OGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATOC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AAOGTGGTAT ACTCGCAGAC AACCCAACOG 2400 

GTATACAATG CAGAGGCCAG TAATAGTAGC CATGAGTCTC GTATTGGTCT AG CTGAG GQG 2460 

TTGGAATCCX5 AGAAGAAGGC AGTTATACCC CTTGTGATCG TGTCAGCCCT GACTTTTATC 2520 

TGTCTAGTGG TTCTTGTGGG TATTCTCATC TACTGGAGGA AATGCTTCCA GACTGCACAC 2580 

TTTTACTPAG AGGACAGTAC ATCCCCTAGA GTTATATCCA CACCTCCAAC ACCTATCTTT 2640 

CCAATTTCAG ATGATGTCGG AGCAATTCCA ATAAAGCACT TTCCAAAGCA TGTTGCAGAT 2700 

TTACATGCAA GTAGTGGGTT TACTGAAGAA TTTGA6AGAC TGAAACSAGTT TTACCRG6AA 2760 

GTGCAGAGCT GTACTGTTGA CTTAGGTATT ACAGCAGACA GCTCCAACCA CCCAGACAAC 2820 

AAGCACAAGA ATCGATACAT AAATATCGTT GCCTATGATC ATAGCAGGGT TAAGCTAGCA 2880 

CAGCTTGCTG AAAAGGATGG CAAACTGACT GATTATATCA ATOCCAATTA TGTTGATGGC 2940 

TACAACAGAC CAAAAGCTTA TATTGCTGCC CAAGGCCCAC TGAAATCCAC AGCTGAAGAT 3000 

TTCTOGAGAA TGATAT6GGA ACATAATGT6 GAAGTTATTG TCATGATAAC AAACCTOGTG 3060 

GAGAAAG6AA GGAGAAAATG TGATCAGTAC TGGCCTGCCG AtGGGAGTGA GGAGTAC6GG 3120 

AACTTTCPGG TCACTCAGAA GAGTGTGCAA GTGCTTGCCT ATTATACTGT GAGGAATTTT 3180 

ACTCTAAGAA ACACAAAAAT AAAAAAGGGC TCCCAGAAAG GAAGACCCAG TGGACGTGTG 3240 

GTCACACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCCCTGCCA 3300 

GIGCTGACCT TTGTGAGAAA GGCAGCCTAT GCCAAGC5GCC ATGCAGTGGG GCCTGTTGTC 3360 

6TCCACTGCA GTGCTGGAGT TGGAAGAACA GGCACATATA TTGTGCTAGA CAGTATGTTG 3420 

CAGCAGATTC AACACGAAGG AACTGTCAAC ATATTTGGCT TCTTAAAACA CATCCGTTCA 3480 

CAAAGAAATT ATTTGGTACA AACTGAGGAG CAATATGTCT TCATTCATGA TACACTGCTT 3540 

GAGGCCATAC TTAGTAAAGA AACTGAGGTG CTGGACAGTC ATATTCATGC CTATGTTAAT 3600 

GCACTOCTCA TTCXrrGGACC AGCAGGCAAA ACAAAGCTAG AGAAACAATT CCAGCTCCTG 3660 

AGCCAGTCAA ATATACAGCA GAGTGACTAT TCTGCAGCCC TAAAGCAATG GAAGAGGGAA 3720 

AAGAAT06AA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTTQGCAT TTCATCCCTG 3780 

AGTGOAGAAG GCACAGACTA CATCAATGCC TCCTATATCA TGGGCTATTA CCAGAGCAAT 3840 

GAATTCATCA TTACXX»GCA CCCTCTCCTT CATACCATCA AGGATTTCTG GAGGATGATA 3900 

TGGGACCATA ATGCCCAACT GGTGGTTATG ATTCCTGATG GCCAAAACAT GGCAGAAGAT 3960 

GAATTTCTTT ACTGGCCAAA TAAAGATGAG CCTATAAATT GTGAGAGCTT TAAGGT CACT 4020 

CrrATCGCTG AAGAACACAA ATGTCTATCT AATGAGGAAA AACTTATAAT TCAGGACTTT 4080 

ATCTTAGAAG CTACACAGGA TGATTATGTA CTTGAAGTGA GGCACTTTCA GTGTCCTAAA 4140 

TGGCCAAATC CAGATAGCCC CATTAGTAAA ACTTTTGAAC TTATAAGTGT TATAAAAGAA 4200 

GAAGCTGCCA ATAGGGATGG GCCTATGATT GTTCATCATG AGCATGGAGG AG TGAO GGCA 4260 

GGAACTTTCT GTGCTCTGAC AACOCTTATG CACCAACTAG AAAAAGAAAA TTCOGTGGAT 4320 

GTTTACCAGG TAGCCAAGAT 6ATCAATCTG ATGAGGCCAG GAGTCTTTGC T6ACATTGAG 4380 

CAGTATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGTGA GCACAAGGCA GGAAGAGAAT 4440 

CCATCCACCr CTCTGGACAG TAATGGTGCA GCATTGCCTG ATGGAAATAT AGCTGAGAGC 4500 

TTAGAGTCTT TAGTTTAACA CAGAAAGGGG TGGGGGGACT CAC ATCTG AG CATTGTTTTC 4560 

CTCTTCCTAA AATTAGGCAG GAAAATCAGT CTAUTTCrGT TATCTGTTGA TTTCC CATCA 4620 

CCTGACAGTA ACTTTCATGA CATAGGATTC TGOOSCCAAA TTTATATCAT TAACA ATGT G 4680 

TGCCTTTTTG CAAGACTTGT AATTTACTTA TTATGTTTGA ACTAAAATGA TTGAATTTTA 4740 

CAOTATTTCT AAGAATGGAA TTGTGGTATT TTTTTCTGTA TTGATTTTAA CAGAAAATTT 4800 

CAATTTATAG AGGTTAG6AA TTCCAAACTA CWSAAAATGT TTGTTTTTAG TGTCAAATTT 4860 

TTAGCT6TAT T7GTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAGTAGCX: 4920 

TGTAAATAAA ACACTCTTCC ATAtGATATT CSU^TTTTA CAACTGCAGT ATTCAOCTAA 4980 
AGTAGAAATA ATCTCTTACT TATTGTAAAT ACTGCCCTAG TGTCTCCATG GACCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGAGTC AAGTTTTCTA GTTCTGTGTA 5100 

ATTGTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTGACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGCAAA 5220 

ATAGAAATAC CTTGATTTTG AAAGAAGTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

TCAAATGGTT TTTATCCAAG GAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 

Seq ID NO: 182 Protein sequence: 
Protein Accession #: Eos sequence 



1 11 21 31 41 51 

| | | | | | 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360 

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNAEASNS SHESRIGLAE GLESEKKAVI 780 

PLVIVSALTF ICLVVLVGIL IYWRKCFQTA HFYLEDSTSP RVISTPPTPI FPISDDVGAI 840 

PIKHFPKHVA DLHASSGFTE EFETLKEFYQ EVQSCTVDLG ITADSSNHPD NKHKNRYINI 900 

VAYDHSRVKL AQLAEKDGKL TDYINANYVD GYNRPKAYIA AQGPLKSTAE DFWRMIWEHN 960 

VEVIVMITNL VEKGRRKCDQ YWPADGSEEY GNFLVTQKSV QVLAYYTVRN FTLRNTKIKK 1020 

GSQKGRPSGR VVTQYHYTQW PDMGVPEYSL PVLTFVRKAA YAKRHAVGPV VVHCSAGVGR 1080 

TGTYIVLDSM LQQIQHEGTV NIFGFLKHIR SQRNYLVQTE EQYVFIHDTL VEAILSKETE 1140 

VLDSHIHAYV NALLIPGPAG KTKLEKQFQL LSQSNIQQSD YSAALKQCNR EKNRTSSIIP 1200 

VERSRVGISS LSGEGTDYIN ASYIMGYYQS NEFIITQHPL LHTIKDFWRM IWDHNAQLVV 1260 

MIPDGQNMAE DEFVYWPNKD EPINCESFKV TLMAEEHKCL SNEEKLIIQD FILEATQDDY 1320 

VLEVRHFQCP KWPNPDSPIS KTFELISVIK EEAANRDGPM IVHDEHGGVT AGTFCALTTL 1380 

MHQLEKENSV DVYQVAKMIN LMRPGVFADI EQYQFLYKVI LSLVSTRQEE NPSTSLDSNG 1440 
AALPDGNIAE SLESLV 



Seq ID NO: 183 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 148-4494 




CATTATTCTA 
TCCAGACAAC 
GTAIACAATG 



41 
I 

G6AGGATTAA 
TGAGAA6CAG 
AGOGTTTCCT 
ACTACAGACA 
ATCAAAAAAA 
ATATTGATGA 
GGGATAAAAC 
ATCTCACTAA 
AQATAACTTT 
AAGGACAAAA 
GTTTTGAGGA 
TTGGGACAGA 
GTTTTGGGAA 
CTGACAAGTA 
ACTGGATTGT 
AAGTTCTTAC 
TTCGAGAGCA 
AGATTCATGA 
ATACCAGCCT 
AGTTTGCAGT 
CAGATGGCTA 
TTCTTCAGAT 
TTGTCGACAT 
AAGAAATAAT 
CTGGTAGAGA 
CACACTACAA 
GAGGAAGTGA 
AACCAGTCAC 
AACTGCCACC 
TTCTTAGATC 
TAACAGAATA 
ATTCTTCAGG 
AAGGGTATAT 
AATCTGCTA6 
ATOCTTCTAT 
ATGTTGGATC 
AATCTGAGAA 
TTACAGATCT 
CTCATGCTTT 
ACTCGCACAC 
TTGGTCTAGC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 



51 
I 

AACAAACAAA 
AGGAGCOGCA 

CGCTTGCATT 
ACAGAGAAAA 
TTGGGGAAAG 
AGATCTTACA 
ATCAirGGAA 
TGACTACCGT 
TGACIGGGGA 
ATTTCCACTT 
AGCAGTCAAA 
AGAAAATTTG 
GCAGGCTGCT 
TTACATTTAC 
TTTTAAAGAT 
AATGCAACAA 
ACAGTACAAG 
AGCAGTTTGT 
TCTTGTTACA 
TTTGTACCAG 
TGAAGACTTG 
AGTAGCCATA 
GCCTACTGAT 
CAAGGAGGAG 
CAGTGCTACA 
TCGCATAGGG 
ATTCTCIGGA 
TAAATTAGCC 
TCACACTGTG 
TCCACATATG 
TGAGGAGGAG 
CTCCAGTCCC 
ATTTTCCTCC 
AAATGCTTCC 
0GAGG6AAAT 
AGGCAGAGA6 
GACAACCAAG 
GGAAA7GCCA 
TACCCCATCC 
AACCCAACCG 
TGAGGGGTTG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
GAATCOQAQ^ AGAAOGCAGT TATACCOCTT GTGATOGTGT CAG00CT6AC TTTTA.T CTGT 2520

CTAlGTGGTTC TTCTGGGTAT TCTCATCTAC TGGAOGAAAT GCTTCCRGAC TGCACftCTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTPCAGATG ATGTOGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAACTA GTGGGTTTAC TGAAGAATTT GAGGAAGTGC AGAGCTGTAC TGTTGACT7A 2760 

GGTATTACAG CAGACAfiCTC CAACCACCCA GACAACAAGC ACAAGAATOG ATACATAAAT 2820 

ATOGTTGOCT ATGATCATAG CAGGGTTAAG CTAGCACAGC TTGCTGAAAA GGATQGCAAA 2880 

CTCACTGATT ATATCAATGC CSU^TTAIGTT GATGGCTACA ACAGAOCAAA AGCTTATATT 2940 

GCTGCCCAAG GCCCACT6AA ATCCACAGCT GAAGATTTCT GGAGAATGAT ATQGGAACAT 3000 

AATGTGGAAG TTATTCTCAT GATAACAAAC CTOGTGGAGA AAGGAAGGAG AAAATGTGAT 3060 

CAGTACTOGC CTGCCGATGG GAGTGAGGAG TACX3GGAACT TTCTGGTCAC TCAGAAGAGT 3120 

GTX3CAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC TAAGAAACAC AAAAATAAAA 3180 

AAGGGCTCOC AGAAAGGAAG AOCCAGTGGA OGTGTQGTCA CAC AGTATCA CTACAOGCAG 3240 

TGGOCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC TGAOCTTTGT GAGAAAGGCA 3300 

GCCTATGCCA AGCGCCATGC AGTGGGGCCT GTTGTGGTCC ACTGCAGTGC TGGAGTTG6A 3360 

AGAACAGGCA CATATATTCT GCTAGACAGT AIGTTGCAGC AGATTCAACA CGAAGGAACT 3420 

GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA GAAATTATTT GGTACAAACT 3480 

GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG CCATACTTAG TAAAGAAACT 3540 

GAGGTCCTCG ACAGTCATAT TCATGCCTAT GTTAATGCAC TCCTCATTCC TGGACCAGCA 3600 

GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTOCTGAGCC AGTCAAATAT ACAGCAGAGT 3660 

GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAGA ATGGAACTTC TTCTATCATC 3720 

CCIGTCGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG GAGAAGGCAC AGACTACATC 3780 

AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT TCATCATTAC CCAGCACCCT 3840 

CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATX5GG ACCATAATGC CCAACTGGTG 3900

GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT WmACTG GCCAAATAAA 3960 

GATGAGCCTA TAAATTGTGA GAGCTTTAAC GTCACTCTTA TGGCT6AAGA ACACAAATGT 4020 

CTATCTAATC AGGAAAAACT TATAATTCAG GACTTTATCT TAGAAGCTAC ACAGGATGAT 4080 

TATCTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC CAAATCCAGA TAGCCCCATT 4140 

AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG CTGCCAATAG GGATGGGCCT 4200

ATGATTGTTC ATGATGAGCA TGGAGGAGTG ACGGCAGQAA CTTTCTGTGC TCTGACAACC 4260 

CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT ACCAGGTAGC CAAGATGATC 4320 

AATCTQATGA GGCCAGGAGT CTTTCCTGAC ATTGAGCAGT ATCAGTTTCT CTACAAAGTG 4380 

ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT CCA CCTCT CT GGACAGTAAT 4440 

GGTGCAGCAT TGCCTGATGG AAATATAGCT GAGAGCTTAG AGTCTTTACT TTAACACAGA 4500

AAGGGGTGGG 'gGGACTCACA TCTGAGCATT GTTTTCCTCT TCCTAAAATT AGGCAGGAAA 4560 

ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG ACAGTAACTT TCATGACATA 4620 

GGATTCTCCC GCCAAATTTA TATCATTAAC AATGTGTGCC TTTTTGCAAG ACTTGTAATT 4680 

TACrrATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT ATTTCTAAGA ATGGAATTGT 4740 

GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT TTATAGAGGT TAGGAATTCC 4800 

AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG CTGTATTTGT AGCAATTATC 4860 

AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA AATAAAACAC TCTTCCATAT 4920 

GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA GAAATAATCT GTTACTTATT 4980 

GTAAATACTG CCCTAGTGTC TCCATGGACX: AAATTTATAT TTATAATTGT AGATTTTTAT 5040 

ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG TTTAGTTTAA TGACGTAGTT 5100 

CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT TGTGTTACCT AAGTCATTAA 5160

CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG AAAT ACCTT C ATTTTGAAAG 5220 

AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTC3VA ATGGTTTTTA TCCAAGGAAT 5280 

TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 
AAA 

Seq ID NO: 184 Protein sequence: 
Protein Accession #: EOS sequence

1 11 21 31 41 51

| | | | | |

MRILKRFLAC IQLLCVCRLD MANGYYRQQR KLVESIGHSY TGALNQKNWG KKYPTQISPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWOKTSL ENTFIHNTGK TVEUTLTHDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEKQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

II.P2VGTEEN LDPKAIIEX3V ESVSRFGKQA ALDPPILLML LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQIj AVFCEVLIWQ QSGYVMLMDY LQHNFREQQY KPSRQVPSSY 300 

TCKEEIHEAV CSSEPENVQA DPBNYTSLLV TWERPRWYD TMIEKFAVLY QQLDGSDQTK 360 

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEBG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRG5EFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTBLPPHT VBGTSASIUD 540 

GSKTVLRSPH MKLSGTAESL NTVSITEYEE ESLLT8FKLD TGAEDSSGSS PATSAIPPIS 600 

ENISQGYIPS SEMPETITYD VLIPBSARMA SEDSTSSGSE ESLKDPSME6 NVWFPSSTDI 660 

TAQPDVGSGR ESFIiQTMYTE IRVDESEKTT KSFSAGPVMS QGPSVTDIiEM PHYSTPAYFP 720

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780 

LVrVSALTFI CLWLVGILI YWRKCPQTAH PYLEDSTSPR VISTPPTPIP PISDDVGAIP 840 

IKBPPKHVAD LHASS6FTEE FEEVQSCTVD LGITADSSNH PDNKHKNRYI NIVAYDHSRV 900 

KLAQLAEKDG KLTDYINANY VDGTORPKAY lAAQGPLKST AEDFWRMIWB HNVEVIVMIT 960 

NLVEKGRRKC DQYWPADGSE EYGNPLVTQK SVOVLAYYTV RNFTLRNTKI XKGSQKGRPS 1020 

GRWTQYHYT QWPDMGVPEY SLPVLTFVRX AAYAKRHAVG PWVHCSAGV GRTGTYIVU> 1080 

SMLQQIQHEG TVNIPGFLKH IRSQRNYLVQ TEEQYVPIHD TLVEAILSKE TBVLOSHIHA 1140 

YVNALLIPGP AGKTiCLEKQP QLLSQSNIQQ SDYSAALKQC NREKNRTSSI IPVERSRVGI 1200 

SSLSGBGTDY INASYIMGYY QSNEFIITQH PLLHTIKDPW RMIWDHNAQL WMIPDGQNM 1260 

AEDEFVYHPN KDEPIKCESF KVTLMAEEHK CI«SNESKLII QDPILEATQD OYVLEVRHFQ 1320 

CPRHPNPDSP ISKTFEZiXSV XKBEAANRDG PMIVBDEHGG VTA6TFCALT TIMHQLEKEH 1380 

SVDVYQVAKM INLMRPGVFA DIEQYQPLYK VILSLVSTRQ BE1»PSTSU>S KGAALPDGNI 1440 
AESLESLV 



Seq ID NO: 185 DNA sequence

Nucleic Acid Accession #: EOS sequence

Coding sequence: 501-4514

1 11 21 31 41 51 
kcACRTACG CAOGCACGAT CTCACTTOGA icTATACACP GGAGGATTAA A^JiJCWA
SSSa^ ATTTCCrrCG CTCCCCCTCC CTCTCCACrC TG^GJ^ "^^^ 

Sgcga^ cogcagacog tctcgaaatg ogaatcctaa agogtttcct cgcttgcatt 

SSoM ^TOCOG OCTOGATTGG GCTAATGGAT ACTACAGACA ACAGfiGAAAA 
S^TACA GGAGCACTGA ATCAAAAAAT TGGGGJAAC^ 

^taS^ atgtaatagc ccaaaacrat ctcctatcaa tattcatgaa gatctt^ 

GAATCTTAAG AAACTTAAAT TTCAGGGTTC GGATAAAACA TCmCWAA 

^Jatat^ tcaSacact gggaaaacag tggaaattaa tctcactaat gactacostg 

AAGCAAGCAA GATAACTTTT CACTOG^ 

I^^^ ^oSotm ggatcagagc atagtttaga aggacaaaaa tttccacttg 

AGMGraAAT CTACTGCTTT GATGOQGACC GATTTTCAAG TTTTGAGGAA GCAGTCAAAG 

ATTTCAAAGC GATTATTGAT GGAGTOGAAA OTGTTAGTOG TTTTGGGAAfi CAOQCTGCTr 
tIIS^ StACT^ AACCTTCTGC CAAACrCAAC TGACAAGTAT TACATTTA^ 
ATWCT^ GACATCTCCT CCCTGCACAG ACACAGTTGA CTGGATTGTT TTTAAAGATA 
CAmAGCAT CTCTGAAAGC CAGTTGGCTG TTTTTTGrGA AGTTQTTACA ATGCAACAAT 
^S^TG ScraOTAC AAAACAATTT TOGAGAGCAA CAGTJ^ 
^^J^ ^™ TCATACACTG GAAAGGAAGA GATTCAT^ ^25I!!5!t 

Sto^cc AGAAAATGTT caggctgacc cagagaatta taccagcctt cttgttocat 

SgA^OC TOGAGTCGTT TATGATACCA TGATTGAGAA GTTTGCAGrT TTCTAOCAGC 

StqgmS aga^ccaa accaagcatg AATrrrraAC agatggctat caagactegg 
SSSS??? ^^tSg ctacccaata. tcagttatct tcttcagata gtagccatat 

ml^^ Iaatacagcg accaactgat tgtcgacatg cctactcata 
SotgaaS TGATcrrrrc cctgaattaa ttogaactga agaaataatc aag^gagg 
^Ig^aaa agaSttgaa gaaggcgcta ttgtcaatcc tcgtagagac agtgctacaa 

SS^^Saa cccSgStt ctaccacaac acactacaat cgcataggga 

SSJJ^IS^ ACTAACOGAT CCCCAACAAG AGGAAGTGAA TTCTCTGGAA 

^TGATGT TCraATACA TCTTTAAATT CCACTTCCCA ACCAfTTCACT AAATTAGCCA 

?SSottg acttctcaga ctgtgactga actgccacct CACACTGTGG 

SS?^ AGCCTCTTTA AATGATGGCT CTAAAACTGT TCTTAGATCT CCACATATGA 

J^Sgg gactccagaa tccttaaata cagtttctat aacagaatat g^gaggaga 
ctttS^ cagittcaag cttgatactg gagctgaaga ttcttcaggc tccagtcccg 
S?2SStc atctctgaga acatatccca agggtatata ttttcctcog 

S^SSaS GACAATAACA TATGATGTCC rrATACX:AGA ATCTGCTAGA AATGCTTCOG 
i^ATra^ TTCATCAGGT TCAGAAGAAT CACTAAAGGA TCCTTCTATG GAGGGAAATG 
i^TCGT^ TAGCTCTACA GACATAACAG CACAGCCCGA TGTTGGATCA GGCAGAGAGA 
GACTAATTAC ACTGAGATAC GTGTTGATGA ATCTGAGAAG ACAACCAAGT 

"^^^ ag^Stc ATGTCACAGG GTCCCTCAGT tacagatcig G^i?^ 
StStctac ctttgcctac ttcccaactg aggtaacacc tcatgctttt a^ccatot 
SaSSaS ggatttggtc tccacggtca acgtggtata ctcgcagaca acx:caaccgg 

^S^T agtagccatc agtctcgtat tggtctagct gaggggttgg 
IJ?^^ ^gcagtt ataccocttg tgatcgtgtc agccctgact tttatctgtc 
^^^TTCT tgtgggtatt ctcatctact ggaggaaatg cttccagact gcacactttt 
I^tagagga cagtacatcc cctagagtta tatccacacc tccaacacct atctttccaa 
^ttcagatga tctcggagca attccaataa agcactttcc aaagcatgtt gc^ttwc 
atgcaagtag tgggtttact gaagaatttg agacactgaa agagttttac caggaagtgc 
JSgctctac tgttgactta ggtattacag cagacagctc caaccaccca gaoa^agc 
^Saatcg atacataaat atcgttgcct atgatcatag cagggttaag ctagcacagc 

^tggcaaa ctgactcatt atatcaatgc caattatgtt gatggctaca 
ISga^caaa agcttatatt gctgcccaag gcccactgaa atccacacct gaagwttct 

GGAGAATCAT ATGGGAACAT AATGTGGAAG TTATTOTCAT GAWACAAAC CTCGTOGAGA 
AAOGAAGGAG AAAATGTGAT CAGTACTGGC CTGCCGATGG GAGTGAGGAG TaCGGGAACT 
TTCTTCTCAC TCAGAAGAGT GTGCAAGTGC TTGCCTATrA TACTGTGAGG AATTTTACTC 

taSSacac aaaaataaaa aagggctccc agaaaggaag acccagtgga cgtgtggtca 

ScAGTATCA CTACACGCAG TGGCCTGACA TGQ6AGTACC AGAGTACTCC CTGCCAGTGC 
Sc^AAGGCA GCCTATGCCA AGCGCCAT6C AGTGGGGCCT GTTGTOGTCC 

IS^Sotgc tggagttgga agaacacgca catatattgt gctagacagt atgttgcagc 

AGATTcScA CGAAGGAACT GTCAACATAT TTGGCTTCTT AAAACACATC ^T^GACAAA 
SaStATTT GGTACAAACr GAGGAGCAAT ATGTCTTCAT TCATGAT^CA CTGGTTGAGG 
CCMACTTAG TAAAGAAACT GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC 
$OT^CC TGGflCCAGCA G6CAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGCC 

agSa^S acagcagagt gactattctg cagccctaaa gcaatgcaac agggaaaaga 
^JS^ ttSatcatc cctgtggaaa gatcaagggt tggcatttca tccctgagtg 

^^^C I^ACATC AATGCCTCCT ATATCATGGG CTAT^C^ "^1^ 

tSt^ac ccagcaccct ctccttcata CCATCAAGGA TTTCTGGAGG ATGATATGGG 
IcmAMGC Saactggtg gttatgattc ctgatggcca aaacatcgca gaagatgaat 

?i^ACTO GOCAAATAAA GATCAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTC™ 
J^OTGAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT 
^^^i^ ACAGGATGAT TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC 
l^^^GA TAGCCCCATT AGTAAAACTT TTGAACTTAT AAGIGTTATA AAAGAAGAAG 
CTGCCAATAG GGATGGGCCT ATGATTGTTC ATGATGAGCA TGGAGGAGTG A0GGCAG6AA 
CTTTCTGTGC TCTGACAACC CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT 
ACCAGGTAGC CAAGATGATC AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATT6AGCAGT 
ATCftGTTTCT CTaSaAGTO ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT 
CCACCTCTCT GGACACTAAT GGTGCAGCAT TGCCtGATGG AAATATAGCT GAGAGCTTAG 

agtottagt ttaacacaga aaggggtggg gggactcaca tctgaccatt gttttcctct 

TCCTAAAATT AGGCAGGAAA ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATOVCCTG 

acagtaactt tcatgacata ggattctgcc gccaaattta tatcattaac aatgtctgoc 

TrrrTGCAAfi ACTTGTAATT TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGrr 
ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT 
TTATAGAGGT TAGGAATTCC AAACTACA6A AAATGTTTGT TTTTAGTGTC AAATTT^ 

CTCTATmrr agcaattatc aggttigcta gaaatataac ttttaataca gtagcctota 

AATAAAACAC TCTTCCATAT GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA 

SSXtct gttacttatt gtaaatactc ccctagtgtc tccatggacc aaattotat 

TTATAATTCT AGATTTTTAT ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TCTGTflATXG 
TOAGTtSa TGACGTAGTT CATTAGCTGG TCTTACTCTA CCAGTITTCT GACATTCTAT 
TGTCTTACCr AAGTCATTAA CTTTGTTTCA GCATGTAAIT TTAAC rmvi TGGAAAATAG 5220

AAAIAOCTTC AnTKlAAAG AAGTTTTTAT GAGAATAACA CCTTACCAAA caTTGTTCAA 5280 

ATGGTTTTTA TCOUWGAAT IGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 186 Protein sequence:
Protein Accession #: EOS sequence

1 11 21 31 41 51

| | | | | |

^4VFKASKITF HWGKCNMSSD GSEKSLBGQX PPLENQIYCP DABRPSSFEE AVKGKGKLBA 60 

LSILFEVGTE EKLDPKAIID GVBSVSRTOK QAALDPPILL KLLPNSTDKY YIYNGSLTSP 120 

PCTOTVDWrv FKDTVSISES QLAVFCEVLT KQQSGYVMLM DYliQNNFREQ QYKPSRQVPS 180 

SrrcKEEIHB AVCSSEPENV QADPBNYTSL LVTWBRPRW YDTMIEKFAV LYQQIJXSDQ 240 

TKHBFLTDGY QDLGAILNNL LPliMSYVLQI VAICTNGLYG KYSDQIilVDM PTONPELDW 300 

PELIGTEEII KEEEEGKDIE BGAIVNPG8D SATNQIRKKE PQISTTTHYN RrCTKYNEAK 360 

TNRSPTRGSE PSGKGDVPNT SLNSTSQPVT KLATBKDISL TSQTVTELPP RTVEGTSASL 420 

NDGSKTVLRS PHMSLSGTAE SLNTVSITEY BEBSLLTSPK LDTGAEDSSG SSPATSAIPF 480 

ISEKISQGYI PSSENPETIT YDVLIPESAH NASEDSTSSG SEESIiKDPSM BSMVWPPSST 540 

DITAQPDVGS GRESFLQTNY TEIRVDESEK TTKSPSAGPV MSQGPSVTDL EMPHYSTFAY 600 

FPTEVTPHAP TPSSRQQDLV STVNWYSQT TQPVYNEASN SSHESRIGLA EGLESBKKAV 660 

IPLVIVSALT PICLWLVGI LIYWHKCFQT AHPYLEDSTS PRVISTPPTP IFPISDDVGA 720 

IPIKHFPKHV ADUIASSGFT EEFSTLKEFY QEVQSCTVDL GITADSSNHP DJJKHKNRYIN 780 

IVAYDHSRVK LAQIAEKDGK LTDYXHANYV DGYNRPKAYI AAQGPIiKSTA EDFWRMIWBH 840 

NVBVIVMITII LVBKGRRKCD QYWPADGSEB YGSFLVTQKS VQVIAYYTVR NFTLRIITKIK 900 

KGSQKGRPSG RWTOYHYTQ WPDKGVPEYS LPVLTFVRKA AYAKRHAVGP WVHCSAGVG 960 

RTGTYIVLDS MLQQIQHEGT VNIPGFLKHI RSQRNYLVQT EEQYVFIHDT LVEAILSKET 1020 

BVLDSHIHAY VNALIiIPGPA GKTKLEKQPQ liLSQSNIQQS DYSAALKQCN REKNRTSSII 1080 

PVERSRVGIS SL9GSGTDY1 MASYIKGYYQ SNEPIITQHP UiHTIKDFWR MIWDHNAQLV 1140 

VMIPDGQNMA EDEFVYWPNK DEPINCESPK VTLMAEmKC LSMEBKLIIQ DFILBATQDD 1200 

YVLEVRHFXIC PKWPNPDSPI SRTPELISVI KEEAANRDGP MIVHDBHGGV TAGTFCALTT 1260 

LMHQLEKENS VDVYQVAKMI NLMRPGVFAD IBQYQPI.YKV ILSLVSTHQE ENPSTSUISH 1320 
GAALPDGNIA ESLSSliV 

Seq ID NO: 187 DNA sequence 

Nucleic Acid Accession #: EOS sequence

Coding sequence: 148-4632 

1 11 21 31 41 51 

| | | | | |

CACACATACG CACGCACGAT CTCACTTCXy^ TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTOCACTC TGAGAAGCAG AGGAGCCGCA 120 

C6G0GAGGGG CXGCAGACCG TCTGGAAATG CGAATCCTAA AAOGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTG6GGAAAG 300 

AAATATCCAA CATCTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATO TOAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACOGT 480 

GTCAGCGGAG GAGTTTCA6A AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCA6TCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTOC TCCCTGCACA GACACA6TTG ACPGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGT06ACAT GCCTACPGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TOGCATAGGG 1560 

AOGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TOCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTOTGACTO AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CACCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800

ARCTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTA3 GGAGGGAAAT 2100 

GTGTGCTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA OGTGTTGATG AATCTGAGAA GACAACCAA6 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCXXTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACXXXIATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AA0GTX3GTAT ACTCGCAGAC AACCCAACOG 2400 

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTOSTA TTGGTCTAGC TGAGGGGTTG 2460 

GAATCOGAGA AGAAGGCAGT TATACCCCTT GTGATOGTGT CAGCCCTGAC TTTTAT CTGT 2520 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGC ACAC TTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAACCATGT TGCAGATTTA 2700 

CATCCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 2760 

CAGAGCTGTA CTGTTCACTT AGGTATTACA GCAGACAGCT CCAAOCACCC AGftCAACAAG 2820 
CACAAGAATC GATACATAAA TATOGTTGCC TATGATCATA GGAGGGTTAA GCTAGCACAG 2880

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG OCAATTATGT TGATGGCTAC 2940 

AACAGACCAA AAGCTTAtAT TGCIGOCCAA GGOOGACTGA AATOCACAGC TGA AGAT WC 3000 

■tGG3UaAATGA TATGGGAACA TAATGIGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAC 3060

AAA3GGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG OGAGTGAGGA GTACGGGAAC 3120 

rrrcTGcrrcA ctcacsaacag tgtgcaagtg cttgcctatt atactgtgag g aattttac t 3180 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GAOCCAGTG6 AOGll Tragl-C 3240 

ACACAGTATC ACTACAOGCA GT06CCTGAC ATGGGAGTAC CAGAGTACTC CCTGOCAGTG 3300 

CTGACCTTTG TGAGAAAfiGC AGCCTATGCC AAGGGGCATG CAGIGGGGCC TGTTCTCGTC 3360 

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 3420 

CAGATTCAAC AOGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 3480 

AQAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 3540 

GGCATACTTA GTAAAGAAAC TGAGGTGCTG GACAC3TCATA TTCATGCCTA TGTTAATGCA 3600 

CICCTCATTC CtQSMXMX AGGCAAAACA AAGCTAGAGA AACAATTCCA GGGTCTCACT 3660 

CTGTCACCCA GGCTGGAGTG CAGAGGCACA ATCPOGGCTC ACTGCAACCT TCCTCTCCCT 3720 

GGCTTAACTC ATCCTCCTAC CTCAGCCTCC 0GAGTGGCT6 GGACTATACT CCTGAGGCAG 3780 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCOCTAAAGC AATGCATVCAG GGAAAAGAAT 3840 

CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 3900 

GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 3960 

ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 4020 

CATAATGOCC AACTGGTGGT TATGATTCCT GATGGOCAAA A CATGG CAGA AG ATGAAT TT 4080 

GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGrGAGA GCTTTAAGGT CACT CTTATG 4140 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 4260 

AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 4320 

GCCAATAG6G ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCA GGAA CT 4380 

TTCTGTGCTC TGACAAOCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 4440 

CAGGTAGOCA AGATGATCAA TCTGATGAfiG CCAGGAGTCT TTGCTGACAT TCAGCAGTAT 4500 

CAGTTTCTCT ACAAAGTGAT CCTCaGOCTT GTGQGCACAA GGCAGGAAGA GAATCCATCC 4560 

ACCTCTCTGG ACAGTAATGG T6CAGCATTG CXTTGATGCAA ATATAGCTGA GAGCTTAGAG 4620 

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 4680 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 4740 

AGTAACTTTC ATGACATAGG ATTCTGOOGC CAAATTTATA TCATTAACAA TGTGTGCCTT 4800 

TTTOGAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATQATTGAAT TTTACAGTAT 4860 

TTCTAAGAAT GGAATTGTGG TATTTTTrPC TGTATTGATT TTAACAGAAA ATTTCAATTT 4920 

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTA6CT 4980 

GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 5040 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 5100 

AATAATCTGT TRCITATTGT AAATACTGCC CTAGTGTCTC CATGGACCAA ATTTATATTT 5160 

ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAA6TTT TCTAGTTCTG TGTAATTGTT 5220

TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280 

TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 5340

ATACCTTCAT TTTGAAAGAA GTTTTTATGA 6AATAACACC TTACCAAACA TrGTTCAAAT 5400 

GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 188 Protein sequence:
Protein Accession #: EOS sequence

1 11 21 31 41 51

| | | | | |

HRItiXRFLAC IQLLCVCRU) HANGYYRQQR KLVBBIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINZDBDIi TQVMVNLKKL KPQGWDRTSL ENTPIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKQIKSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRAIjS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFIUiNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQI* AVFCEVLTKQ QSGYVMLMDY liQNNFREQQY KFSRQVFSSY 300 

TGKEBIHEAV CSSEPENVQA DPENYTSLIjV TWERPRWYD TMIEKPAVI/Y QQLDGEDQTK 360 

HEPLTDGYQD LGAILNNLLP KMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTE5IIKE EEEGKDIEBG AIVKPGRDSA TNQIRXKEPQ ISTTTHYNRI GTKYNBAKTN 480 

RSPTRGSBFS GKGDVPNTSL KSTSQPVTKL ATEXDISLTS QTVTELPPHT VEGTSASLKD 540 

GSKTVLRSFH MNLSGTAESI. NTVSITEYEE ESLLTSFKU3 TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPBTITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660

TAQPDVGSGR ESPLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PBYSTPAYFP 720 

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780 

LVIVSALTFI CtiWLVQXLI Y«RKCFQTAH FYLBDSTSPR VISTPPTPIP PISDDVGAIP 840 

IKHFPKKVAO LBASSGFTEE PETLKEPYQE VQSCTVDLGI TADSSNHPDN KHKNRYINIV 900 

AYDKSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA QGPLKSTAED FWRMIHEHKV 960 

EVrVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ VZAyyTVSKP TLRNTKIKKG 1020

SQKGRPSGRV VTQYHYTQWP DMGVPEYSIiP VLTFVRKAAY AKRHAVGPW VHCSAGVGRT 1080 

GTYIVLDSML QQIQHBGTVN IFGFI*KHIRS QRNYliVQTEE QYVFIHDTLV EAILSKETEV 1140 

LDSKIHAYVN ALLIPGPAGK TKLEKQFQGL TLSPRLECRG TISAHCNLPL PGLTDPPTSA 1200 

SRVAGTILL5 QSVIQQSDYS AALKQOIREK MRTSSIIPVB RSRVGISSLS GEGTDYINAS 1260 

YINGYYQSNE PIITQHPLLR TIKDFWRMIH DHNAQLWMZ FOGQKMAEDE FVYHPNKDEP 1320 

ZKCESFKVTZi MAEEHXCLSN EEKLIZQOFZ Z.EATQODYVL EVSHFQCPKH PNPDSPISKT 1380 

FELZSVZXEE AANRDGFNZV BDEHGGVTAG TFCALTTLMH QLBKSNSVDV YQVAKKXNU4 1440 
RPGVFADI£Q YQFLYKVZLS LVGTRQEafP STSLDSKGAA LPDOTZAESL BSLV 

Seq ID NO: 189 DNA sequence
Nucleic Acid Accession #: NM_002820
Coding sequence: 304..831

1 11 21 31 41 51

| | | | | |

CCGGTTCGCA AAGAAGCTGA CTTCAGAGGG GGAAACTTTC TTCTTTTAGG AGGCGGTTAG 60 

CCCTGTTCCA OGAACCCAGG AGAACTGCTG GCCAGATTAA TTAGACATTG CTATGGGAGA 120 

06TGTAAACA CACTACTTAT CATTGATGCA TATATAAAAC CATTTTATTT TC36CTATTAT 180 
TTCAGAGGAA GOGCCTCPGA TTTGTTTCTT TTTTCCCTTT TTGCTCTTTC TGGCTGTGTG 240

GTTTGGAGAA AGCACAGTrG GAGTAGCCGG TTGCTAAATA AGTCCGGAGC GOGAGCGGAG 300 

AOQATGCAGC GGAGACTGGT TCAGCAGTGG AGGGTGGGGG TGTTCXrTGCT GAGCTAGGCG 360 

GTGCCCTCCT GOGGGOGCTC GGTGGAGGGT CTCAGCCGCC GCCTCAAAA6 AGCTGTGTCT 420 

GAACATCAGC TCCTCCATGA CAAGGGGAAG TCCATCCAAG ATTTACGGCG ACGATTCTPC 480 

CTTCACCATC TGATCGCAGA AATCCACACA GCTGAAATCA GAGCTACCTC GGAGGTGTCC 540 

OCTAACTCCA AGCCCTCTCC CAACACAAAG AACCACCC06 TCCGATTTGG GTCTGATGAT 600 

GA6GGCAGAT ACCTAACTCA GGAAACTAAC AAGGTGGAGA CGTACAAAGA GCAGOOGCTC 660 

AAGACAGCTG GOAAGAAAAA GAAAGGCAAG CC06GGAAAC GCAAGGAGCA GGAAAAGAAA 720 

AAAOGGOGAA CTCGCTCTGC CTGGTTAGAC TCPGCjAGTGA CTGGGAGTGG GCTAGAAOGG 780 

GACCACCTGT CTGACACCTC CACAACGTCG CTGGAGCTOG ATTCACGGTA ACAGGCTTCT 840 

CTGGCCC3GTA GCCTCAGCGG GGTGCTCTCA GCTGGGTTTT GGAGCCTCCC TTCTGOCTTG 900 

GCTTGGACAA ACCTAGAATT TTCTCCCTTT ATGTATCTCT ATOGATTGTG TAGCAATTGA 960 

CAGAGAATAA CTCAGAATAT TGTCTGCCTT AAAGCAGTAC OCCCCTACCA CACACACCCC 1020 

TCTCXnCCAG CACCATAGAG AGGOGCTAGA GaXATTCCT CTTTCTCCAC CGTCACCCAA 1080 

CATCAATGCT TTACCACTCT ACCAAATAAT TTGATATTCA AGCTTCAGAA GCTAC3TGA0C 1140 

ATCTTCATAA TTTGCTG6AG AAGTGTATTT CTTCCXXTTA CTCTCACAOC TGGGCAAACT 1200 

TTCTTCAGTG TTTTTCATTT CrTAOGTTCT TTCACTTCAA GGGAGAATAT AGAAGCATTT 1260 

GATATTATCT ACAAACACTG CAGAACAGCA TCATGTCATA AACGATTCTG AGCCATTCAC 1320 

ACTTTTTATT TAATTAAATG TATTTAATTA AATCTCAAAT TTATTTTAAT GTAAAGAACT 1380 

TAAA7TATGT TTTAAACACA TGGCTTAAAT TTGTTTAATT AAATTTAACF CrGGTTTCTA 1440 

OCAGCTCATA CAAAATAAAT GGTTTCTGAA AATGTTTAAG TATTAACTTA CAA6GATATA 1500 

GGrrrrrcTC atgtatcttt ttgttcattg gcaagatgaa ataatttttc tagggtaatg 1560

GC3GTAGGAAA AATAAAACTT CACATTTAAA AAAAA 



Seq ID NO: 190 Protein sequence: 
Protein Accession #: NP_002811



1 11 21 31 41 51 

| | | | | |

MQRRLVQQKS VAVFLIiSYAV P50GRSVEGL SSHLKRAVSS HQLUIDKGKS IQDLRRRFFL 60 

HKLIAEZHTA EIRATSEVSP NSKPSPNTKN HPVRFGSDDE GRYLTQErNX VETYKEQPLK 120 
TPGKKKKGKP GXRKEQERKK RRTRSAWLDS GVTGSGLEGD HL5DTSTTSL BLDSR 

Seq ID NO: 191 DNA sequence
Nucleic Acid Accession #: XM_059328
Coding sequence: 52..1023

1 11 21 31 41 51

| | | | | |

GGGCTGTCCG GCCCACTCCC CTGGGAGCGC GAGCGGTG6A CCCAGGCGGC CATGTCCCGC 60 

CCTCGCATGC GCCTGGTGGT CACCGCGGAC CACTTTGGTT ACTGCCCGCG ACGCGATGAG 120 

GGTATCGTGG AGGCCTTTCT GGCCGGGGCT GTGACCAGCG TGTCCCTGCT GGTCAAOGGT 180 

GCGGCCACGG AGAGCGCGGC GGAGCTGGCC CGCAG6CACA GCATCCCCAC 6GGCCTCCAC 240 

GCCAACCTGT CX33AGGGCC6 CCCOGTGGGT CGGG0CXX3CC GTGGC6CCTC ATCGCTGCTC 300 

GGCCCGGAAG GCTTCTTCCT TGGCAAGATG GGATTCCGGG AGGCGGTGGC 6GC06QAGAC 360 

GTX3GATTTGC CTCAGGTGCG GGAGGAGCTC GAGGCCCflAC TAAGCTGCTT CCGGGAGCTG 420 

CTGGGCy^GGG CCCCCACGCA CXK3GGACGGG CACCAGCACG TGCACGTGCT CXX3^GGOGTG 480 

TGCCAGGTGT TCGCCGAGGC GCTGCAGGCC TATGGGGTGC GCTTTACGCG ACTGCCGCTG 540 

GAGCGCGGTG TGGGTGGCTG CACTTGGCTG GAGGCCCCCG CGCGTGCCTT CGCCTGOGCC 600

GTGGAGOGCG ACGCCCGGGC CGCOGTGGGC CCCTTCTCCC GCCACGGCCT GCGGTGGACA 660 

GA0GCCTTC6 TGGGCCTGAG CACTTGCGGC OGGCACATGT C0GCTCAC06 CGTGTCCGGG 720 

G0CCTG60GC GGGTCCTG6A AGGTACCCTA 60GGGCCACA CCCT6ACAGC GGAGCTGAT6 780 

GCGCACCCCG GCTACCCCAG TGTGCCTCCC ACCGGCGGCT GCGGTGAAOG CCCOGACGCT 840 

TTCTCTTGCT CTTGGGAGCG GCTGCATGAG CTGCGCGTCC TCACXX5CGCC CACGCTGCX3G 900 

GCCCAGCTTG CCCAGGATGG CX3TGCAGCTT TGCGCCCTCX3 ACGACCTGGA CTCCAAGAGG 960 

CCAGGGGAGG AGGTCCCCTG TGAGCCCACT CTGGAACCCT TCCTGGAACC CTCCCTACTC 1020 

TGAOOCCCTA CAGACAACCA AGCACTAATC CXXrTTAGTAC CAAGAAAGGG GAGGCAGGAT 1080 

TTAGTCCTGG CCCAGCCCAG AGCTGGGACC TGGAGCACGA TCTGTTGACT TCCCTGGGTA 1140 

GGACACTGCC ACCTCTGGGC TCAGGTCCTC ATGCCTCCAA ATGGCATCTA GAGTTTGAGC 1200 

AGCCTTCTTG GCTGCAGGCA GGCCTAGCCT GTGGCAGCGG GCTAGGGCCC GCAGAGCATT 1260 

TGGTGCCCCT CCATGTTGCA ATGCAAACAC CTTCACCACT GGGGCAGTGG GGAGAGATGG 1320 
CTATATTAAT AAAATAAOGT GTGTCTTTC 

Seq ID NO: 192 Protein sequence: 
Protein Accession #: XP_059328



1 11 21 31 41 51 

| | | | | |

MSRPRMRLW TADOFGYCPR RDEGIVEAFL AGAVTSVSUi VNGAATESAA EZARRHSIPT 60 
GUIANI*SSGR PVGPAHRGAS SLLGPEGFFL GKMGFREAVA AGDVDLFQVR EBLEAQItSCP 120 
RELI^SAPTH ADGHQHVHVL PGVCQVFAEA LQAYGVRPTR LPLERGVGGC TWLEAPARAP 180
ACAVERDARA AVGPFSRHGL RWTDAFVGLS TCGHHMSAHR VSGALARVDE GTIAGHTLTA 240 
BLMAHPGY^S VPPTGGCGEG PDAFSCSHER LHBLRVLTAP TLRAQLAQDG VQLCALDDLD 300 
SKRPGEEVPC EPTLEPFLEP SIiL 

Seq ID NO: 193 DNA sequence 

Nucleic Acid Accession #: NM_005688.1

Coding sequence: 126..4439

1 11 21 31 41 51

| | | | | |

COGGGGAGGT GGCTCATGCT CGGGAGCGTG GTTGAGCG6C TOGOCSOG G TT GTCCTG6AGC 60 

AOGGGCGCAG GAATTCTGAT GTGAAACTAA CAGTCIGTGA GOCCTGGAAC CTCOGCTCAG 120 
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AGAAGMXSAA G6MATOGAC AXAGGAAAAG 
CIGTGAGGGA GAGAACCAGC ACTTCTGGGA 
GGAGAACTOG ACOGTTGGAA TGCOVAGATG 
TCTCTCTTGA TGCCTCCATG CATTCTCAGC 
5 GAAAGTACCA TCATGGCTTG AGTGCTCTGA 
AOOCAGTGGA CAATGCTGGG crTTTTTCCT 
CC0C3TGTGGC CCACAAGAAG GGGGAGCTCT 
AOaWSTCTTC TGAOGTGAAC TGCAGAAGAC 
AACyrTGOGCC AGAOGCTGCT TCCCTGOGAA 

10 TCATCCTGTC CATOGTGTGC CTGATGATCA 
TCATGGTGAA ACACCTCTTG GAGTATACCC 
TGTTGTTAGT GCTGGGCCTC CTCCTGACGG 
CTTGOGCATT GAATTACOGA ACXGGTGTCC 
TTAAGAAGAT CCTTARGTTA AAGA ACATTA 

15 TTTGCTCCAA OGATGGGCAG AGAATGTTTG 
GAGGACCCX5T TGTTGCCATC TTAGGCATGA 
GCTTCCTGGG ATCAGCTGTT TTTATCCTCT 
TCACAGCATA TTTCAGGAGA AAATGCGTGG 
ATGAAGTTCT TACTTACATT AAATTTATCA 

20 AGAGTCTTCA AAAAATCGGC GAG6AGQAGC 
ftOOGrTATCW: T G T GG GTGTG GCTCCCATTG 
CTGTTCATAT GACCCTGGGC TTCGATCTGA 
TCrrCAATTC CATGACTTTT GCTTTGAAAG 
AAGCCTCAGT GGCTGTTGAC AGATTTAAGA 

25 TAAAGAACAA ACCAGCCAGT CCTCACATCA 
GGGACTCCTC OCACTCCAGT ATCCAGAACT 
ACAAGAGG6C TTCCAGGGGC AAGAAAGAGA 
AQGCGGTGCr GGCAGAGCAG AAW3GCCACC 
CCX3AAGAGGA AGAAGGCAAG CACATCCACC 

30 ACAGCATOGA TCTGGAGATC CAAGAGGGTA 
GTGGAAAAAC CTCTCTCATT TCAGCCATTT 
TTGCAATCAG TGGAACCTTC GCTTATGTGG 
TGAGAGACAA CATCCTGTTT GGG AAGG AAT 
ACAGCTGCT6 CCTGAGGCCT QACCTGGCCA 

35 GAGAGCGAGG AGCCAACCTG AGCX3GTGGGC 
TOTATAGTGA CACGAGCATC TACATCCTGG 
TGGGCAACCA CATCTTCAAT AGTGCTATCC 
TTGTTACCCA CCAGTTACA6 TACCTGGTTG 
GCIGTATTAC GGAAAGAG6C ACCCATGAGG 

40 CCATTTTTAA TAACCTGTTG CTGGGAGAGA 
AAACCAGTGG TTCACAGAAG AAGTCACAAG 
AGGAAAAAGC AGTAAAGCCA GAGGAAGGGC 
GTTCAGTGCC CTGGTCAGTA TATGGTGTCT 
TCCTGGTTAT TATGGCCCTT TTCATGCTGA 

45 GGTTGAGTTA CTGGATCAAG CAAGGAAGCXS 
CCTGGGTGAG TGACAGCATG AAGGACAATC 
CCCTCTCCAT GGCAGTCATG CTGATCCTGA 
GCACXSCTGCG AGCTTCCTCC CGGCTGCATG 
CTATGAAGTT TTTTGACACG ACCCCCACAG 

50 TGGATGAAGT TGAOGTGCGG CTGCCGTTCC 
TGGTGTTCrr CTGTGTGGGA ATGATOGCAG 
GGCCCCTTGT CATCCTCTTT TCAGTCCTGC 
TGAAGCGTCT G6ACAATATC ACX5CAGTCAC 
AGGGCCTTGC CACCATCCAC GCCTACSW^TA 

55 AGCTGCTGGA TGACAACCAA GCTCCTTTTT 
CTGTGCGGCT GGACCTCATC AGCATCGCCC 
TTATGCACGG GCAGATTCCC CCAGCCTATG 
TAACGGGGCT GTTCCAGTTT ACGGTCAGAC 
OGCTGGAGAG GATCAATCAC TACATTAAGA 

60 AGAACAAGGC TCCCTCGCCT GACTGGCCCC 
AGATGAGGTA CCGAGAAAAC CTCCCrCTTG 
CTAAAGAGAA GATTGGCATT GTGGGGCGGA 
CCCTCTTCCG TCTGGTGGAG TTATCTGGAG 
GTGATATTGG CCTTGCCGAC CTCOGAAGCA 

65 TGTTCAGTGG CACTGTCAGA TCRAATTTGG 
TTTGGGATGC CCTGGAGAGG ACACACATGA 
TTGAATCTGA AGTGATGGAG AATGGGGATA 
GCATAGCTAG AGCCCTGCTC CX3CCACTGTA 
CCATGGACAC AGAGACAGAC TTATTGATTC 

70 GTACCATGCT GACCATTGCC CATCGCCTGC 
TGCTGGCCCA GGGACAGGTG GTGGAGTTTG 
GTTCCOGATT CTATGCCATG TTTGCTGCTG 
TCCTOXTGT T6A0GAA6TC TCTTTTCTTT 
CCCCTCATOG CGTCCTCCTA CCX»AACCTT 

75 GTTCCGGATT GGCTTGTGTG TTTCACTTTT 
ATTCCATATT CATGTAAACA AAATTTAGTT 
GGGAACCGTT ATTATAATTG TATCAGAGGC 
TCTATATATA ATTCTGTACA TAGCCTATAT 
TATTAAAATA AGCACTGTGC TAATAACAGT 

80 TTGCTGTACT AGAGATCTGG TTTTGCTATT 
CTCTAGCTGG TGGTTTCACG GTGCCAGGTT 
ATAGTGGGCC CTCCGACAGC CCCXTTCTGCC 
GAGACGGGTG GGCGGCTGGA GACCATGCAG 
CTGTCCTGGT GTCACTTACT GTTTCTGTCA 

85 TTTCACTCCC TCCATCAAGA ATGGGGATCA 
TTTCCTCOCT TCTTCTTTTT GCTGTTGTTT 
TCCCACTGCC TCAGGITCCT ATGGCTGGCC 



AGTATATCAT CCCCAGTCCT GGGTATAGAA 180 

OGCACAGAGA COGTG A AGAT TCCAAGTTCA 240 

CCTTGGAAAC AGCAGOOOGA GCCGAOGGOC 300 

TCAGAATCCT GGAT6AGGAG CATOOCAAGG 360 

AGCCCATCCG GACTACTTCC AAACUX3UX: 420 

GTATGACTTT TTOGTGGCTT TCTTCTCPGG 480 

CAATGGAAGA OGTGTGGTCT CTGTCCAAGC 540 

TAGAGAGACT GTGGCAAGAA GAGCTGAA3G 600 

GGGnCIGTG GATCTTCTGC CSGCAOCAflGC 660 

CGCAGCTGGC TGGCTTCAGT GGACCAGCXTT 720 

AGGCAACAGA GTCTAACCTG CAGTACAGCT 780 

AAATCGTGCG GTCTTGGTaS CTTGCACTGA 840 

GCTTGCGGGG GGCCATCCTA ACCATGGCAT 900 

AACAGAAATC CCTGGGTGAG CTCATCAACA 960 

AGGCAGCACC CGTTGGCAGC CTGCTGGCTG X020 

TTTATAATGT AATTATTCTG GGAOCAACAG 1080 

TTTACCCAGC AATGATGTTT GCATGAOOGC X140 

CCGCCAOGGA TGAACGTGTC CAGAAGATGA 1200 

AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260 

GTOGGATATT GGAAAAAGCC GGGTACTTOC 1320 

tOOTGGTGAT TGOCAGGGtG GTGAOCTrCT 1380 

CAGCAGCACA GGCTTTCACA GTGGTGACAG 1440 

TAACACCGTT TTCAGTAAA6 TCCCTCTCAG 1500 

GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

OGCCCAAGCT GACCCCCAAA ATGAAAAAAG 1680 

AGGTGAGGCA GCTGCAGCGC ACTGAGCATC 1740 

TCCTGCTGGA CAGTGACGAG OQGCCCAGTC 1800 

TGGGCCACCT GCGCTTACAG AGGACACTGC 1860 

AACTGGTTGG AATCTGOGGC AGTGTGGGAA 1920 

TAGGCCAGAT GACGCTTCTA GAGGGCAGCA 1980 

CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

ATGATGAAGA AAGATACAAC TCIGT6CTGA 2100 

TTCTTCCCAG CAGGGACCT6 AGGGAGATTG 2160 

AGCJGCCAGAG GATOWSCCTT GCCOQOGCCT 2220 

ACGACCCCCT CAGTGCCTTA GAT6CCCATG 2280 

GGAAACATCT CAAGTCCAAG ACAGTTCTGT 2340 

ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CACOGCCAGT TGAGATCAAT TCAAAAAAGG 2520 

ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 

AGCTTGTGCA GCTGGAACAG AAAGGGCAGG 2640 

ACATCCAGGC TGCTGGGGGC CCCTTGGCAT 2700 

ATGTAGGCAG CACCGCCTTC AGCACCTGGT 2760 

GGAACACCAC TGTGACTOGA GGGAACGAGA 2 620 

CTCATATGCA GTACTATGOC AGCATCTAOa 2880 

AAGCCATTCO AGGAGTTCIC TTTOTCAAGG 2940 

ACGAGCTTTT CCGAAGGATC CTTOGAAGCC 3000 

GGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

AGGCCGAGAT GTTCATCCAG AAOGTTATCC 3120 

GAGTCTTCCC GTGGTTCCTT GTGGCAGTGG 3180 

ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 

CTTTCCTCTC CCACATCAOG TCCAGCATAC 3300 

AAGGGCAGGA GTTTCTGCAC AGATACCAGG 3360 

TTTTGTTTAC GTGTGCX»TG OSGTGGCTGG 3420 

TCATCACCAC CACGGGGCTG ATGATOGTTC 3480 

CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TGGCATCTGA GACAGAAGCT OGATTCACCT 3600 

CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660 

AGGAGGGAGA GGTGACCTTT GAGAACGCAG 3720 
TCCTAAAGAA AGTATCCTTC AOGATCAAAC 3780 

CAGGATCAGG GAAGTCCTCG CTGGGGATGG 3840 

GCTGCATCAA GATTGATGGA GTGAGAATCA 3900 

AACTCTCTAT CATTCCTCAA GAGCCX3GTGC 3960 

ACCCCTTCAA CCAGTACACT GAAGACCAGA 4020 

AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

ACTTCTCAGT GGGGGAAOSG CAGCTCTTGT 4140 

AGATTCTGAT TTTAGATGAA GOCACAGCTG 4200 

AAGAGACCAT COGAGAAGCA TTTGCAGACT 4260 

ACACGGTTCT AGGCTCCGAT AGGATTATGG 4320 

ACACCCCATC GGTCCTTCTG TCCAACGACA 4380 

CA6AGAACAA GGTCGCTGTC AAGGGCTGAC 4440 

AGAGCATTGC CATTCCCTGC CTGGGGCGGG 4500 

GCCTTTCTOG ATTTTATCTT TOGCACAGCA 4560 

AGGGAGAGTC ATATTTTGAT TATT6TATTT 4620 

TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

CTATAATGAA GCTTTATACG TGTAGCTATA 4740 

TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

GCATATTCCT TTCTATCATT TTTGTACAGT 4860 

AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920 

TTCTGGGTGT OCAAAGGAAO AOGTGTGGCA 4980 

GCCTCCCCAC AGCCGCTCCA GGGGTGGCTG 5040 

AGCGCCGTGA GTTCTCAGGG CTCCTGCCTT 5100 

GGAGAGCAGC GGGGGGAAGC CCAGGCCCCT 5160 

CAGAGACATT CCTCOGAGCC GGGGAGTTTC 5220 

CTAAACAAGA ATCAGTCTAT CGACA6AGA6 5280 

ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 
GTTGCnTCCA AGCCCTGGAG CCAACTGCTG CTTTrrGAGG TGGCRCTTTT TaVTTTGCCT 5400 

ATTCCCACAC CTCCACAflTT CAGTGGCAGG GCTCAGQATT TOGTGGGTCT GTTTTCCTTT 5460 

CTCACOGCAG TOGTCGCACA GTCTCTCTCT CrcrCTCCCC TCAAAGTCTG CAACTTTAAG 5520 

CAGCTCnCC TAATCAGTGT CTCACACPCG OGTAGAAGTT TTTGTACTGT AAAGAC3UXT 5580 

ACCTCW3C3TT GCTGGTTGCT GTGTGGTTTG GTCTGTTCCC GCAAACCCCC TTTGTGCTGT 5640 

GQGGCIGCSTA GCTCAGGlGG GOGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGCGTTGC 5700 

ATGTCGTGAC CAACTAGACA TrCTGTOSCC TTAGCATGrr TGCTGAACAC CTTGTGGAAG 5760 

CAAAAATCTG AAAATGTGAA TAAAATTATT TTOGATPTTC TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 194 Protein sequence: 
Protein Accession #: NP_005679.1 

1 11 21 31 41 51 

JlKDIDIGKEy ilPSPGYRSV RERTSTSGTO BDSEDSKFRR TRPLECQDAL ETAARAECa^ 60 

LDASMHSQLR ILDBEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCK TFSWLSSLAR 120 

VAHKKGEa:.SM EDVWSLSKHB SSUVNCSBILB JUUWQEELNOT GPDAASLRaV VWIFCRTRLI 180 

LSIVCLMITQ LAGFS6PAFM VKHLLEYTQA TESNLQYSLL LVUHiLTEI VRSWSUU^TW 240 

ALNYRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQaMPEA AAVGSLLACG 300 

PWAIU31IY NVIILGPTGP LGSAVFILFY PAKMPASRLT AYFRRKCVAA TDERVQKMNE 360 

VLTYIKFIKM YAKVKAPSQS VQIURESERR ILEKAGYFQG ITVGVAPIW VIASWTPSV 420 

HMMEDLTA AQAFTWTVP NSMTPAUCVT PPSVKSLSEA SVAVDSFRSI. PLMBBVHMIK 480 

KKPASPHIKI EMKSATLAWD SSHSSI<»ISP KLTPKMKKDK RASRGKKEKV RQLQRTEHQA 540 

VLABQKGHLI. LDSDERPSPB EBECKHIHIiG HLRLQRTI.HS IDLEIQEXSKL VGIOjSVGSG 600 

KTSLISAIU5 QMTLLiXSSIA ISGTFAYVAQ QAWILNATLH DNILFCKBTO EBRYNSVLNS 660 

CCLRPDLAIL PSSDLTEIGE RGANLSGGQS QRISLARALY SDRSIYIUM) PLSALDAHVG 720 

HHIWSAIRK HLKSKTVLFV THQLQYLVDC DEVIFMKEGC ITERGTHEEL MNLNGDYATI 780 

FNNLLLGETP PVEINSKKET SGSQKKSQDK GPKTGSVKKE KAVKPEEGQL VQLEEKGQGS 840 

VPWSVYGVYI QAAGGPLAFL VIMALFMUIV GSTAFSTWWL SYWIKQGSGN TTVTRGNETS 900 

VSDSMKDNPH MQYYASIYAL SMAVMLIUCA IRGWFVKGT LRASSRUJDB LFRRILRSPM 960 

KPFDTTPTGR lUIRPSKDMD EVDVRLPFQA EMFIQNVILV PPCVGMIAGV PPWPLVAVGP 1020 

LVILPSVLHI VSRVLIRELK RLDNITQSPF LSHITSSIQG LATIEAYNKG QEPLHRYQEL 1080 

LDDKQAPFFI. FTCAMRWIAV RLDLISIALI TTTGLMIVLM HGQIPPAYAG lAISYAVQLT 1140 

GLFQFTVRIA SETEARFTSV ERINHYIKTL SLEAPARIKN KAPSPDWPQB GEVTFESAEM 1200 

RYREHLPLVL KKWSPTIKPK EKIGIVGRTG SGKSSLGMAL FRLVELSGGC IKIDGVRISD 1260 

IGLADUtSKL SIIPQBPVLF SGTVRfflHiP FNQYTEDQIW DALERTHMKE CIAQLPLKLS 1320 

SEVMENGDNF SVGERQLU:! ARALUiHCKI LILDEATAAM DTETDLLIQE TIREAFADCT 1380 
MLTlAHRIJrr VLGSDRIMVL AQGQWEEDT PSVLLSNDSS RFYAMFAAAE NKVAVKG 

Seq ID NO: 195 DNA sequence 
Nucleic Acid Accession #: NM_006470 
Coding sequence: 228..1922 



1 11 21 31 41 51 

GCTGTCCTCA GCCTGAGTAC TCTAGCTGCC TTGTCGCCAT CGCATCTGGC TGCCATCCAG 60 

CGCCAGCACA CAGTAAMAG OCGCCGAGCT TCCTCTGGGA GCGAGGAAAC AGTTAAAATC 120 

TTGCAGCAGC IGCAATCATC TAGGCGTGGT TCTCTTGTCT GACTTGGGCT GCACAGATCC 180 

TGGGCCAAGG GACAGAAGAA AGACAGCCTA GGAGCAGAGC CTCCCAGATG GCTGAGTTGG 240 

ATCTAATGGC TCCAGGGCCA CTGCCCAGGG CCACTGCTCA GCCCOCAGCC CCTCTCAGCC 300 

CAGACTCTGG GTCACCCAGC CCAGATTCTG GGTCAGCCAG CCCAGTGGAA GAW5AG6ACG 360 

TOGGCTCCTC GGAGAAGCTT GGCAGGGAGA CGGAGGAACA GGACAGCGAC TCTGCAGAGC 420 

AGGQGGATCX: TGCTGGTGAG GGGAAAGAGG TCCTGTGTGA CTTCTGCCTT GATGACACCA 480 

GAAGAGTGAA GGCAGTGAAG TCCTGTCTAA CCTGCATGGT GAATTACIGT GAAGAGCACT 540 

TGCAGCCGCA TCAGGTGAAC ATCAAACTGC AAAGCCACCT GCTGACCGAG CCAGTGAAGG 600 

ACCACAACTG GC6ATACTGC CCTGCCCACC ACA6CCCACT GTCTGCTTTC TGCTGCCCTG 660 

ATCAGCAGTC CATCTGCCAG GACTGTrGCC AGGAGCACAG TGGCCACACC ATAGTCTCCC 720 

TGGATGCAGC CCGCAGGGAC AAGGAGGCTG AACTCCAGTG CACCCAGTTA GACTTGGAGC 780 

GGAAACTCAA GTTGAATCAA AATGCCATCT CCAGGCTCCA GGCTAACCAA AAGTCTGTTC 840 

TGGOGTCGGT GTCAGAGGTC AAAGOGGTGG CTGAAATGCA GTTTGGGGAA CTCCTTGCTG 900 

CTCTGAGGAA GGCCCAGGCC AATGTGATGC TCTTCTTAGA GGAGAAGGAG CAAGCTGCGC 960 

TQAGCCAGGC CAAOGGTATC AAGGCXCACC TGGAGTACAG GAGTGCOGAG ATGGAGAAGA 1020 

GCAAGCAGGA GCTGGAGAGG ATGGCGGCCA TCAGCAACAC TCrTCCAGTTC TTGGAGGAGT 1080 

ACTGCAAGTT TAAGAACACT GAAGACATCA CCTTCCCTAG TGTTTACGTA GGGCTGAAGG 1140 

ATAAACTCTC GGGCATCCGC AAAGTTATCA CGGAATCCAC TGTACACTTA ATCCAGTTGC 1200 

TGGAGAACTA TAAGAAAAAG CTCCAGGAGT TTTCCAAGGA AGAGGAGTAT GACATCAGAA 1260 

CTCAAGTGTC TGCCGTTGTT CAGCGCAAAT ATTGGACTTC CAAACCTGAG CCCAGCACCA 1320 

GGGAACAGTT OCTCCAATAT GOSTATGACA TCACGTTTGA CCCGGACACA GCACACAACT 1380 

ATCTCCGGCT GCAGGAG6AG AACCGCAAGG TCACCAACAC CAOGCCCTGG GAGCATCCCT 1440 

ACCOGGACCT CCCCAGCAGG TTCCTGCACT GGCGGCAGGT GCTGTCCCAG CAGAGTCTGT 1500 

ACCTGCACAG GTACTATTTT GAGGTGGAGA TCTTCGGGGC AGGCACCTAT GTTGGCCTGA 1560 

CCTGCAAAGG CATCX3ACCGG AAAGGGGAGG AGCGCAACAG TTGCATTTCC GGAAACAACT 1620 

TCTCCTCGAG CCICCAATGG AACGGGAAGO AGTTCACGGC CTGGTACAGT GACATGGAGA 1680 

CCCCACTCAA AGCTGGCCCT TTCCGGAGGC TCGGGGTCTA TATCGACTTC COGGGAGGGA 1740 

TCCTTTCCTT CTATGGCGTA GAGTATGATA CCATGACTCT GGTTCACAAG TTTGCCTGCA 1800 

AATTTTCAGA ACCAGTCTAT GCTGCCTTCT GGCTTTCCAA GAAG6AAAAC GCCATCCGGA 1860 

TTGTAGATCT GGGAGAGGAA CCCGAGAAGC CAGCACCGTC CTTGGGGGTG ACTGCTCCCT 1920 

AGACTCCAGG AGCCATATCC CAGACCTTTG CCAGCTACAG TGATGGGATT TCCATTTTAG 1980 

GGTGATTTGT GGGCAGAAAT AACTGCTGAT GGTAGCTGGC TTTTGAAATC CTATGGGGTC 2040 

TCTGAATGAA AACATTCTCC AGCTGCTCTC TTTTGCTCCA TATGGTGCTG TTCTCTATGT 2100 

GTTTGCAGTA ATTCTTTTTT TTTTTTTTGA GAOGGAGTCT OGCACTGTTG CCCAGGCTGG 2160 

AGAGCAGTGG CGCGATCTTC GCTCACTGCA AGCTCOJOCT CCOGAGTTCA AGCAATTCTC 2220 

CTGCCTCAGC CTOXGAGTA GCTGGGATTA CAG6TGCCTG CCACCACACC CAGCTAATGT 2280 

TrTCyrATTTT TAGTAGAGAT QGGGTTTCAC CATGTTGGOC AGGCAGATCT CAAACTCCTG 2340 



ACCTCXTTGAT GCA0CX3VCCT OQGOCTCXXZA AAGTGCTGGG ATTACATGOG TGAGCCACTG 2400 

GGCCCTGCCX GTTTGTAGTA ATTTTTACGC ACCAAATCTC CCTCATCrrC TAGTGCCATT 2460 

CrCCTCTCIQ TTCAGGTAAA TGTGACACTG TGOCCAGAAT GGATGACCAG GAACCTTAAA 2520 
GAGTGGCTGA AAAGATTGCA GAGTTATCAT AATAAATTGC TAACTTGOGT 

Seq ID NO: 196 Protein sequence: 
Protein Accession #: NP_006461 

1 11 21 31 41 51 

| | | | | | 

KAEU9U4APG PXtPRATAQPP APLSPDSGSP SPDSGSASPV EEEDVGSSEK USRETEEQDS 60 

DSAEQGDPAG EGKBVLCDFC LDDTTRRVKAV KSCLTCMVNY CEEHLQPHQV NIKIiQSHLLT 120 

EPVKDHNWRY CPAHHSPLSA FCCPDQQCIC QDCCXJEHSCT TIVSLDAARR OKEAELQCTQ 180 

LDIiERiCLKLN ENAISRLQAN QKSVIiVSVSE VKAVAEMQFG ELLAAVRKAQ ANVMLFLBEK 240 

EQAALS0AK6 IKAHItEYRSA S4EKSKQELB RMAAISHTVQ FLESYCKFKM TEDITFPSVY 300 

VGUCDXLSGI RJCVITESTVH liIQLLENYKK KLQEFSKEEE YDIRTQVSAV VQRKYWTSKP 360 

BPSTREQPLQ YAYDITFDPD TAHKYLRLQE EMRICVTOTTP WBHPYPDLPS RFLHWRQVIiS 420 

QQSLYLHRYY PEVEIFGAGT WGIiTCKGID RXGEERNSCI SGilNFSWSLQ WUGKEPTAWY 480 

SDMETPliKAG PFHRLGVYID PPGGILSFYG VBYDTtfTLVK KFACKFSEPV YAAFWItSKKS 540 
NAIRIVDLGE EPEKPAPSLG VTAP 

Seq ID NO: 197 DNA sequence 
Nucleic Acid Accession #: NM_004316 
Coding sequence: 433-1149 

1 11 21 31 41 51 

| | | | | | 

CCC6AGAG0C GGGGCAAGAG AGOGCAGCCT TAGTAGGAGA GGAACGCX»AG ACX3CGGCAGA 60 

GCX3GGTTCAG CACTGACTTT TGCTGCTGCT TCTGCTTTTT TTTTTCTTAG AAACAAGAAG 120 

GOGCCAGOGG CAGCCTCACA CG0GAGC6CC ACGCGAGGCT CCOGAAGCCA ACCOGCGAAG 180 

GGAGGAGGGG AGGGAGGAGG AGGCXSGCGTG CAGGGAGGAG AAAAAGCATT TTCACCTTTT 240 

TTOCTCCCAC TCTAAGAAGT CTCCCXSGGGA TTTTGTATAT ATTTTTTAAC TTOOGTCAGG 300 

GCTCCCGCTT CATATTTCCT TTTCTTTCCC TCTCTGTTCC TGCACCCAAG TTCTCTCTGT 360 

GTCCCCCTOG CGGGCCCCGC ACCTCGCGTC CCX5GATCGCT CTGATTCCGC GACTCCTTGG 420 

CCGCCGCTGC GCATGGAAAG CTCTGCCAAG ATGGAGAGGG GCGGOGCOGG CCAGCAGCCC 480 

CAGCCGCAGC CCCAGCAGCX: CTTOCTGCCG CCOGCAGCCT GTTTCTTTGC CAOGGCCGCA 540 

GCCGCGGCGG CCGCAGCCGC OGCAGCGGCA GCGCAGAGOG CGCAGCAGCA GCAGCAGCAG 600 

CAGCAGCAGC AGCAGCAGCA GCAGGCGCCG CAGCTGAGAC OGGCGGCCGA CGGCCAGCCC 660 

TCAGGGGGCG GTCACAAGTC AGCGCCCAAG CAAGTCAAGC GACAGCGCTC GTCTTCGCCC 720 

GAACTGATGC GCTGCAAACG CCGGCTCAAC TTCAGOSGCT TTGGCTACAG CCTGCCGCAG 780 

CAGCAGC06G COGCOGTGGC GCGCCGCAAC GAGCGCGAGC GCAACCGCGT CAAGTTGGTC 840 

AACCTGGGCT TTGCCACCCT TOGGGAGCAC GTCCCCAACG GCGCGGCCAA CAAGAAGATG 900 

AGTAAGGTGG AGACACTGCG CTCGGCGGTC GAGTAGATCC GCGCGCTGCA GCAGCTGCTC 960 

GACGAGCATG AOGCGGTGAG OGCCGCCTTC CAGGCAGGGG TOCTGTOGCC CACCATCTCC 1020 

CCCAACTACT CCAAOGACTT GAACTCCATG GC06GCT0GC CGGTCTCATC CTACTOGTCG 1080 

GACGAGGGCT CTTAOGACCC GCTCAGCCCC GAGGAGCAGG AGCTTCTCGA CTTCACCAAC 1140 

TGGTTCTGAG GGGCTCGGCC TGGTCAGGCC CTGGTGCGAA TGGACTTTGG AAGCAGGGTG 1200 

ATGGCACAAC CTGCATCTTT AGTGCTTTCT TGTCAGTGGC GTTGGGAGGG GGAGAAAAGG 1260 

AAAAGAAAAA AAAAGAAGAA GAAQAAGAAA AGAGAAGAAG AAAAAAAOGA AAAGAGTCAA 1320 

CXZAACCCCAT CGCCAACTAA GCGAGGCATG CCTGAGAGAC ATGGCTTTCA GAAAAOGGGA 1380 

AGCGCTCAGA ACAGTATCTT TGCACTCCAA TCATTCAOGG AGATATGAAG AGCAACTGGG 1440 

ACCTGAGTCA ATGCGCAAAA TGCAGCTTGT GTGCAAAAGC AGTGGGCTCC TGGCAGAAGG 1500 

GAGCA6CACA CGCGTTATAG TAACTCCCAT CACCTCTAAC ACGCACAGCT GAAAGTTCTT 1560 

GCTOSGGTCC CTTCACCTOC COGCCCTTTC TTAGAGTGCA GTTCTTAGCC CTCTAGAAAC 1620 
GAGTTGGTCT CTTTC 



Seq ID NO: 198 Protein sequence: 
Protein Accession #: NP_004307 

1 11 21 31 41 51 

| | | | | | 

MESSAKMESG GAOQQFQPQP QQPFLPPAAC PFATAAAAAA AAAAAAAQSA QQQQQQQQQQ 60 
QQQQAPQLRP AADGQPSGGG HKSAPKQVKR QRSSSPBLMR CKRRLNFSGF GYSLPQQQPA 120 
AVARRNERER NRVKLVNLGF ATLREHVPNG AANKKMSKVE TLRSAVEYIR ALQQLLDEHD 180 
AVSAAFQAGV LSPTISPNYS NDLNSMAGSP VSSYSSDBGS YDPLSPEEQE LLDFTHWF 

Seq ID NO: 199 DNA sequence 
Nucleic Acid Accession #: NM_007015 
Coding sequence: 1-1005 

1 11 21 31 41 51 

| | | | | | 

ATGACAGA6A ACTCCGACAA AGTTCCCATT 6CCCTGGTGG GACCTGA7GA CGTGGAATTC 60 

TGCAGGCCCC CGGCGTAC6C TACGCTGAGG 6T6AAG0CCT CCA6CCCCGC GGGGCTGCTC 120 

AAGGTGGGAG CCGTGGTCCT CATTTCGGGA GCTGTGCTGC TGCTCTTTGG GGCCATOGGG 180 

GCCTTCTACT TCTGGAAGGG GAGCGACAGT CACATTTACA ATGTCCATTA CACCATGAGT 240 

ATCAATGGGA AACTACAAGA TGGGTCAATG GAAATAGACG CTGGGAACAA CTTGGAGACC 300 

TTTAAAATGG GAAGTGGAGC TGAAGAAGCA ATTGCAGTTA ATGATTTCCA GAATGGCATC 360 

ACAGGAATTC GTTTTGCTGG AGGAGAGAAG TGCTACATTA AAGOGCAAGT GAAGGCTCGT 420 

ATTCCTGAGG TGGGCGCOGT GACCAAACAG AGCATCTCCT CCAAACTGGA AGGCAAGATC 480 

ATGCCAGTCA AATATGAAGA AAATTCTCTT ATCTGGGTGG CTGTAGATCA GCCTGTGAAG 540 

GACAACAGCT TCTTGAGTTC TAAGGTGTTA 6AACTCTGGG GTGACCTTCC TATTTTCTGG 600 

CTTAAACCAA CCTATCCAAA A6AAATGCAG AGGQAAAGAA GAGAAGTGGT AASAAAAATT 660 

GTTOCAACTA CCACAAAAAG ACCACACAGT GGACCA0G6A GCAAOOCAGG CGCTGGAAGA 720 

CTGAATAATG AAACCA6ACC CAGTGTTCAA GAGGACTCAC AAGCCTTCAA TCCIGATAAT 780 
CCTTATCATC AGCAGGAAGG GGAAAGCATG ACATTOSAOC CTAGftCTOGH TCAOGAAGSA 840 

ATCTOTTCTA TAGAATCTAG GOSGAGCTAC ACCCACTGCC AGAAGATCIG TGAAOCCCTG 900 

GGGGOCTATT ACCCATGGCC TTATAATrAT CAAGGCTGCC GTTOtSGCCTG CAGAGTCATC 960 

ATGC3CATGTA GCTGGTGGGT GGCCCGTATC TTGGGCATGG TGTGAAATCA CTTCATATAT 1020 

CACGTGCTGT AAAATAAGAA CTAGCTGAAG AGACAACCAA AGAAGCATTA AGGCAGGTTG 1080 

ATGCTGATGG GACCATAAAA TATTTTTACa OGCftGCCTGA GOGGTTATTC TTGACACTCT 1140 

TAACAGAATT TTTTrAATCG TTTTCCaGAA CTTTAGTATA TGCAAATGCA CTGAAAGGCT 1200 

AGTTCAAGTC TAAAATGCCA TAAOCCOGTT ATTTCTTATT TTTTArnGC ATTCATTTGC 1260 

CATAAGTCrr CCCrroCTTC CATCTTCCAA AGCTATTTOG AAATAAACAC GAAAATTTAC 1320 
AGTTTGCC 



Seq ID NO: 200 Protein sequence: 
Protein Accession #: NP_008946 

1 11 21 31 41 51 

MTENSDKVPI ALVGPOOVEP CSPPAYATLT VKPSSPAHLL KVGAWLISG AVLLLFGAIG 60 

AFYFWKGSDS HIYNVHYTMS INGKLQDGSM EIDAGNNLET FKMGSGAEEA IAVNDFQNGI 120 

TCIRFAGGEK CYIKAQVKAR IPEVGAVTKQ SISSKLEGKI MPWOffiEMSL IHVAVDQPVK 180 

DMSPLSSKVL ELCGDLPIPW UCPTYPKEIQ RERRBWRKI VPTTTKRPHS GPRSNPGAGR 240 

LNNETRPSVQ EDSQAFNPDN PYHQQEGESM TTOPRUJHEG lOClECRRSY THOQKICEPL 300 

GGYYPWPYNY QGCRSACRVI MPCSWWVARI LGMV 

Seq ID NO: 201 DNA sequence 
Nucleic Acid Accession #: NM_000728.2 
Coding sequence: 112..495 

1 11 21 31 41 51 

GTAATAAGAG OGGGGTCTCC GCGGGGAAGG CGCCCACAGC AGGTGTGGTG TTCATCCCGG 60 
GTCGACCGGC CGCTCGCGCT GCCCTCAAAC TCTAGTOOOC AGAGAGG0G6 CATGGGTTTC 120 

CGGAAGTTCT CCCCCTTCCT GGCTCTCAGT ATCTTGGTCC TGTACCAGGC GGGCAGCCTC 180 

CAGGOSGOGC CATTCAGGTC TGCCCTGGAG AGCAGCCCAG ACCCGGCCAC ACTCA6TAAA 240 

SSSoOGC GOCTOCreCT GGCIGCACTG GTCCAGGACT ATGTGCAGAT GAAGGCCAOT 300 

GAGCTGAACC AGGAGCAGGA GACACAGGGC TCCAGCTCOG CTGCCCAGAA GAGAGCCIGC 360 

AaScTCCCA CCTGTGTGAC TCATCGGCIG GCAGGCTTGC T6RCCAGATC AGGGGGCATG 420 

^TGAAGAGCA ACTTOGTGCC CACCAATGTG GGTTCXAAAG CCTTTGGCAG GC6CCGCAGG 480 

GACCTTCAAG CCTGAGCAGA TGAATCACTC CAGGAAGAAG GTGTGTCCTA AATCCAATCA 540 

CATATCCTTA TAAGAGATTC ACTCAGAAGA CACATGTGGA GAAGGTGACA TGACAGAGGC 600 

AAGGAGGCAC AAGCCAAGGA AGTCTGTGTC TACCAGAAGC CAGAATCACA GAACAGTCTC 660 

TGGAAGAAGA GCftCCCCTGC TCACACCTAG AGTTTGGACT TCCAGCTTCC AGAACTGTGA 720 

GAGAATAftTT TCIGTTGTTT TAAGCCACAA AGTTTGTGGT AATTTGTTAT GACAGCCCTA 780 

GGAAACTAAT ACAATACATT TTCATTTATT TTGGGTAAAT GCCTTGGAGT GGGATTGCTG 840 

GGTTATTTCG AAAGTGTGTA TTTAACTCTG TAAGAAACTG OCAAACTATT TTCTGAAGTG 900 

ACTGTACCAC TTCGCCTTCT TGCCAGCCAC ATATGAGAGC TCTAGTATTT CCACAAATAG 960 

GTATGTAGCA GTATCTCATT GCTOTTTTAA TTTGTATTTC CCCAATGACT AATGACGTTG 1020 

AGCATCTATT TTACCATATG TTTATCACCT TTATTGAAGG GTCTGTTTAA ATCTTCTGCT 1080 

AAATTTTTGT TGGCTTGCTT GCTTTATTAG TGTTGAGTTT TTAGAGCTCT TTATATGTTG 1140 

TGGATGCAAG ATTGTTTTCA GATATATAGT TTGGAAACTT CCTTOCCCTG AATCTGCGGA 1200 

TTGCTTTTTC ATTTTCTTAG CAGTGTCTCT CACAGA6AAA AAGTTGTAAT TTGAATAAGA 1260 

TCCAATTCAT CTTTTTTTTT CTTTTATGTA TTGTGCTTTT AGTTCATGTC TAAGAACTCT 1320 

TTGCCTAACT AAGGTCCCAA GGTCACAATA ACCTTATTCT ATACTTTCTT GTAAAAGTTT 1380 

TATAGmTA TATTTTATAT GTAGATTAGT GATCTATTTT GAGTTAATTr TTGTATAAGG 1440 

TCAGAGGTGT AGGTTGAAAT TCATACCTGT GAATATAGAT ACCCAATTGT TTCAGTGCCA 1500 

TTTCTTAAAA AGACTGTTAT TTCACCATTT AATTGCCCCT GCACCTTTGT CAAAAAGCAA 1560 

CTGATCATAT TTGTGTGGGT ATATTTCTGG GTTCTCAATT CTGTCTCATT GATTGATTTO 1620 

ACCATTCTTT TGCCAATGTC ATACTGCCTT GATTAGTGTA GTGTTAAAGT GAATCTCAAA 1680 

ACCAGATAAT GTGGGTCTAC CAACATTGTT CATTCTTGTT CAAAAAGATT TTAGCTACAT 1740 

CTAAAATATT TTCTACATCT TTTATACATT TTAGAATCAG TGTGTTACTA TCTACAAAAT 1800 

TTCTGATGAG ATTTTTAATG GGATTGTGTT AAATCAGTGG GTTAATTTTG GGAGAATTAG 1860 

CATATTAATA ATATTAAGTC GTTCAATTCA TGAACACAAT ACATGTTTTC ACTTATTTAG 1920 
GTTTTCTCTG TTTTTTTTTT ITTAACAGTG TTCTCAGTTT TCAACAGAAA TATTCTACAC 1980 
ATATCTTGrP AGATTTTTAA CTATTTTATT TTTTGGTGCT AATGTAAATG GTACTTAAAC 2040 

ATTTTTCTTT TTAATTGTTC ATTGCTAGTA GATAGAAATA CAATATTTAA AATATTAGGA 2100 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 202 Protein sequence: 
Protein Accession #: NP_000719.1 

1 11 21 31 41 51 

MGFRKFSPFL ALSILVLYQA GSLQAAPFRS ALESSPDPAT LSKEDARLLL AALVQDYVQM 60 
KASELKQSQE TQGSSSAAQK RAC3ITATCVT HRLAGLLSRS GGMVKSNPVP TNVGSKAPGR 120 
RRSDLQA 

Seq ID NO: 203 DNA sequence 
Nucleic Acid Accession #: NM_001741 
Coding sequence: 71..496 

1 11 21 31 41 51 

CTCTGGCTGG ACGCCGCCGC CGCCGCTGCC ACCGCCTCTG ATCCAAGCCA CCTCCOGCCA 60 

GAGAGGTGTC ATGGGCTTCC AAAAGTTCTC CCCCTTCCTG GCTCTCAGCA TCPTGGTCCT 120 

GTTCCAGGCA GGCAGCCTCC ATGCAGCACC ATTCAGGTCT GCCCTGGAGA GCAGCCCAGC 180 

AGACCCGGCC ACGCTCM5TG AGGACOAAGC GCGCCTCCTO CTGCSCTGO^ ISSSSS^ 240 

CTATOTGCAG ATGAAOQCCA GTGAGCTGGA GCAGGAGCAA GAGAGAGftGG GCTCCAGOCT 300 
GGACAGCCCC AGATCTAAGC GGTGCGGTAA TCTGAGTACT TGCATGCTGG GC3VCATACAC 360 
GCAiSGACTTC AACAAGTTTC ACAOGTTCCC CCAAACTGCA ATTGGGGTTG GAGCACCTGG 420 
AAAGAAAAGG GATATGTCCA GCGACTTGGA GAGAGACCAT CGCCCTCATG TTAGCATGCC 480 
CCAGAAT6CC AACTAAACTC CTCCCTTTCC TTCCTAATTT CCCTTCTTGC ATCCTTCCTA 540 
TAACTTGATG CATGTGGTTT GGTTCCTCTC TGGTGGCTCT TTGGGCTGGT ATTGGTGGCr 600 
TTCCTTGTOG CAGAGGATGT CTCAAACTTC AGATGOGAGG AAAGAGAGCA GGACTCACAG 660 
GTTGGAAGAG AATCACCTGG GftAAATACCA GAAAATGAGG GCCGClTlliA GTCCCCCAGA 720 
GATGTCATCA GAGCTCCTCT GTCCTGCTTC TGAATGTGCT GATCATTTGA GGAATAAAAT 780 
TATTTTTOOC C 

Seq ID NO: 204 Protein sequence: 
Protein Accession #: NP_001732 

1 11 21 31 41 51 

| | | | | | 

HSFQXFSPFXi ALSILVLLQA GSLHAAPPRS ALESSPADPA TLSEDSARLL LAALVQDYVQ 60 
MKASEZiBQBQ EREGSSLDSP RSKRGQILST GKLGTYTODP NKFHTFPQTA IGVGAPGKKR 120 
DMSSDLEROH RFSVSMPQNA N 

Seq ID NO: 205 DNA sequence 
Nucleic Acid Accession #: NM_005361 
Coding sequence: 1-945 

1 11 21 31 41 51 

| | | | | | 

ATGCCTCTTG AGCAGAGGAG TCAGCACTGC AAGCCTGAAG AAGGCCTTGA GGCCCGAGGA 60 
GAGGCCCTGG GCCTGGTGGG TGCGCAGGCT CCPGCTACT6 AGGAGCAGCA GACCGCTTCT 120 
TCCTCTTCTA CTCTAGTGGA AGTTACCXITG GGGGAGGTGC CT6CTGCCGA CTCACCGAGT 180 
CCrCCOCACA GTCCTCAGGG AGCCTOCAGC TTCTOGACTA CCATCAACTA CACTCTTTGG 240 
AGACAATCOG ATGAGGGCTC CAGCAACCAA GAAGAGGAGG GGCCAAGAAT GTTTCCCGAC 300 
CTGGAGTCOG AGTTCCAAGC AGCAATCAGT AGGAAGATGG TTGAGTTGGT TCATTTTCTG 360 
CTCCTCAAGT ATCGAGCCAG GGAGCXX3GTC ACAAAGGCAG AAATGCTGGA GAGTGTCCTC 420 
AGAAATTGCC AGGACTTCTT TCCCGTGATC TTCAGCAAAG CCTCOGAGTA CTTGCAGCTG 480 
GTCTTTGGCA TCGAGGTGGT GGAAGTGGTC CCCATCApCC ACTTGTACAT CCTTGTCACC 540 
TGCCTGGGCC TCTCCTACGA TGGCCTGCTG GGGGACAATC AGGTCATGCC CAA6ACAGGC 600 
dCCTGATAA TCGTCCTGGC CATAATOGCA ATAGAGGGOG ACTGTGCCGC 76AGGAGAAA 660 
ATCTGGGAGG AGCTGAGTAT GTTGGAGGTG TTTGAGGGGA GGGAGGACAG TGTCTTOGCA 720 
CATCCCAGGA AGCTGCTCAT GCAAGATCTG GTGCAGGAAA ACTACCTG6A GTACCGGCAG 780 
GTGCCCGGCA GTGATCCTGC ATGCTACGAG TTCCTGTGGG GTCCAAGGGC CCTCATTGAA 840 
ACCAGCTATG TGAAAGTCCT GCACCATACA CTAAA6AT0G GTGGAGAACC TCACATTTCC 900 
TACC5CACCCC TGCATGAACG GGCTTTGAGA GAGGGAGAA6 AGTGA 

Seq ID NO: 206 Protein sequence: 
Protein Accession #: NP_005352 

1 11 21 31 41 51 

| | | | | | 

MPLEQRSQKC KPEEGLEARG EALGLVGAQA PATEBQQTAS SSSTLVEVTL GEVPAADSPS 60 
PPHSPQGASS PSTTINYTLH RQSDEGSSNQ EBB6PRMFPD LESSFQAAIS RKMVELVHFL 120 
LLKYRAREPV TKAB4IiBSVL RNCX2DFFPVI FSKASEYLQL VFGIEWEW PISHLYILVT 180 
CLGLSYDGIiL GDNQVMPKTG IiLIXVIiAIIA IBGDCAPEEK IWEELSMLEV PEGREDSVPA 240 
HPRKIjXiMQDL VQENYLEYRQ VPGSOPACYE FLWGPRALIE TSYVKVLHBT LKZGGEPHIS 300 
YPPIiHERALR EGEE 

Seq ID NO: 207 DNA sequence 
Nucleic Acid Accession #: NM_021115 
Coding sequence: 743-2893 

1 11 21 31 41 51 

| | | | | | 

AAAGGAAGGG AGGGAGGGAG AAAGGAGAAG TTGGTTTAGA GGCCAGCCGG ACGAfiCTTTG 60 
GGCACOCsGCC TTAGGAGGGC CACCCTCAGA GTCTGACAGC AGGTGAAGGT CCTAAATCTC 120 
CCCAAACTAA CTGGTGTCTT TTCTCCTCTT CCAAGATGCT dTCCGGAGG GA6ATGCTAG 180 
CCCTTTGGGT CCTTACCTCC TGCCCTCAGG AGCCCCGGAG AGAGGCAGTC CTGGCAAAGA 240 
GCACCCTGAA GAGAGAGTGG TAACAGCGCC CCCCAGTTCC TCACAGTCGG CGGAAGTGCT 300 
GGGCGAGCTG GTGCTGGATG GGACOGCACC CTCTGCACAT CACGACATCC CAGCCCTGTC 360 
AOCGCTGCTT CCAGAGGAGG CCCGCCCCAA GCACGCCTTG CCCCCCAAGA AGAAACTGCC 420 
TTCGCrCAAG CAGGTGAACT CTGCCAGGAA GCAGCTGAGG CCCAAGGCCA CCTOCGCAGC 480 
CACTGTCCAA AGGGCAGGGT CCCAGCCAGC GTCGCAGGGC CTAGATCTCC TCTCCTCCTC 540 
CACGGAGAAG CCTGGCCCAC GGGGGGACCC GGAOCOCATC GTGGCCTCCG AGGAGGCATC 600 
A6AAGT6CCC CTTTGGCTGG ACOGAAAGGA GAGTGCGGTC CCTACAACAC CCGCACCCCT 660 
GCAAATCTCC CCCTTCACTT CGCAGCCCTA TGTGGCCCAC ACACTCCCCC AGAGGCCAGA 720 
ACCOGGGGAG CCTGGGCCTG ACATGGCCCA GGAGGCCCCC CAGGAGGACA CCAGCCCCAT 780 
GGCCCTGATG GACAAAGGTG AGAATGAGCT GACTGGGTCA GCCTCAGAGG AGAGCCA6GA 840 
GACCACTACC TOCACCATTA TCACCACCAC GGTCATCACC AC0QA6CAG6 CACCAGCTCT 900 
CTGCAGTGTG AGCTTCTCCA ATCCTGAGGG GTACATTGAC TCCAGCGACT ACCCACTGCT 960 
GCCCCTCAAC AACTTTCTGQ AGTGCACATA CAACGTGACA GTCTACACTG GCTATGGGGT 1020 
GGAGCrCCAG GTGAAGAGTG TGAACCTGTC CGATGGGGAA CTGCTCTCCA TCCGCGGGGT 1080 
GGAOGGCCCT ACCCTGACCG TCCTGGCCAA CCAGACACTC CTGGTGGAGG GGCAGGTAAT 1140 
CCGAAGCCCC ACCAACACCA TCTCXX3TCTA CTTCOGGACC TTCCAGGAOG ACGGCCTTGG 1200 
GACCTTCCAG CTTCACTACC AGGCCTTCAT GCTGAGCTGC AACTTTCCCC GCCGGCCTGA 1260 
CTCTGGGGAT GTCACGGTGA TCGACCTGCA CTCAGGTGGG GTGGCCCACT TTCACT6CCA 1320 
CCTGGCCTAT GAGCTCCAGG GCGCTAAGAT GCTGACATGC ATCAATGCCT CX»AGC06CA 1380 
CTGGAGCAGC CAGGAGCCCA TCTGCTCAGC TCCTTGTGGA GGGGCAX3TOC ACAATGCCAC 1440 
CAT0G60CGC GTCCTCTCCC CAAGTTACCC TGRAAACACA AATGGGAGCC AAITC7GCAT 1500 
CrGGACGATT GAAGCTCCAG AGGGCCAGAA GCTGCACCTG CACTTTGAGA GGCTGTTGCT 1560 
GCATGiVCAAG GACAGGAtGA OSGTTCACAG Og3GC»CACC AACAAGTOUS CTCnCrCTA 1620 

CGACTCCCTT CAAAOCQAGA G ' lVrrCUem ' TGAGGGCCTG CTSACGGAAG GCAACAOCAT 1680 

COGCATOGAG TTCACGTCOG ACCAGGCCOG GGCQGCCTCC ACCTTCAACA TCOGATTTGA 1740 

AGOGTTTGAG AAAGGCCACT GCTAIGAGOC CTACATCCAG AATGGGAACT TCACTACATC 1800 

G6ACC0GACC TATAACATTG 06ACTATAGT GGAGTTCACC TGCGAOCCOG GCCACICCCT 1860 

G6AGCAGGGC OQGGOCAtCA TCGAATGCAT CAAtCTGOSG GACCCA3ACT GGAAIGACAC 1920 

AGA6CO0CTG TGCAGAGCCA TGTGTGGTGG QGAGCTCTCT GCTGTGGLTG GGGTGGTATT 1980 

GTCCCCAAAC TGGCCCGAGC CCTAOGTGGA AGGTGAAGAT TGTATCTGGA AGATCCACGT 2040 

GGGAGAAGAG AAACGGATCT TCTTAGATAT CCAGTTCCTG AATCTGAGCA ACAGTCaCAT 2100 

CTTGAGCATC TACGATGGOG ACGAGGTCAT GCXXCACATC TTGGGGCAGT ACCTTGGGAA 2160 

CASTGGCCXX: CA6AAACTGT ACTCCTCCAC GCCAGACTTA ACCATCCAGT TCCATTOQGA 2220 

COCTGCTGGC CTCATCTTTG GAAAGGGCCA GGGATTTATC ATGAACTACA lAGAGGTATC 2280 

AAGGAATGAC TCCTGCTCQG ATTTACOCX3A GATCCAGAAT GGCTGGAAAA CCACTTCTCA 2340 

CAOGGAGTTG GTGCGGGGAG CCAGAATCAC CTAOCAGTOT GACOOOGGCT ATGACATGGT 2400 

GGGGAGTGAC ACCCTCACCT GCCAGTGGGA CCTCAGCTGG AGCAGOGAOC COCCATTTTG 2460 

TGAGAAAATT ATGTACTGCA CCGACCCCGG AGAGGTGGAT CACTOGACCC GCTTAATTTC 2520 

GGATCCTGTG CTGCTGGTGG GGACCACCAT CCAATACACC TGCAAC3C00G GTTTTGTGCT 2580 

TGAAGGGAGT TCTCTTCTGA CCTGCTACAG CCGTGAAACA GGGACTCCCA TCTOGAOGTC 2640 

TOGCCTGCCC CACTGCGTTT CAGAAGOQGC AGCAGABAOG TOGCTGGAAG GGQGGAACAT 2700 

GGCCCTGGCT ATCTTCATCC OGGTCCTCAT CATCTCCTTA CTGCT6GGAG GAGCCTACAT 2760 

TTACATCACA AGATGTCGCT ACTATTCX:AA CCTCCXSCCTG CCTCTGATCT ACTCCCACCC 2820 

CTACAGCCAG ATCACCGTGG AAACOGAGTT TGACAACCCC ATTTACGAGA CAGGGGGAAC 2880 

CCAAAAGGTT TAGGGTTTCA TTTAAAAAGA GGTACCCTTT AAAAAGGGGC TTGTGAACTC 2940 

AAOCCXaATT TCCCGGAGAC ATTTATCCAA AGGCCCTQGG GGOCTTGATT TAAACOCOCA 3000 

AAAGGOGGCT GTTTTTTGGT TAAACTTTTT AACAAAGGGT TA0EK3GTTTT TTCCCOGGAT 3060 
rrXATAAATT TTAAAAGTG 



Seq ID NO: 208 Protein sequence: 
Protein Accession #: NP_066938 

1 11 21 31 41 51 

| | | | | | 

MAQEAPQBDT SPMAUflDKGE NELTGSASEE SQETTTSTII TTTVITTEQA PALCSVSFSN 60 

PEGYIDSSDY PIiIJ>UINFLE CTYNVTVYTG YGVELQVKSV NLSDGELLSI RGVDGPTLTV 120 

LANQTLLVEG QVIRSPTNTI SVYFRTFQDD GLGTFQIiHYQ AFMLSCNFPR RPDSGDVTVM 180 

DLKSGGVAHF HCHLGYELQG AKMLTCINAS KPHWSSQEPI CSAPCGGAVH NATIGRVLSP 240 

SYPENTHGSQ FCIWTIEAPE GQKLHLHFER LLUTOKDRMT VHSGQTNKSA XiYDSLQTES 300 

VPFEGLLSE6 NTIRIEFTSD QARAASTFNI RPEAPEKGHC yEPYIQNGMP TTSDPTVNIG 360 

TIVEFTCDPG HSLEQGPAII ECINVRDPYW KDTEPLCRAM OSGBIiSAVAS WLSPNWPEP 420 

yVEGEDCIWK IHVGEEKRIP LDIQFLNLSN SDILTIYDGD EVMPHILGQY ZX2ISGPQKLY 480 

SSTPDLTIQF BSDPAGLIPG lOSQGFIMNYI EVSRNDSCSO LPEIQNGWKT TSHTELVRGA 540 

RITYQCDPGY DIVGSDTLTC QIVDLSWSSDP PFCEKIMYCT DPGEVDHSTR LISDPVLLVG 600 

TTIQYTCNPG FVLEGSSLLT CYSRETGTPI HTSRLPHCVS EAAAETSLBG GNMALAIFIP 660 
VIiIISLLLGG AYIYITRCRY YSNLRLPLMY SHPYSQITVE TEFDNPIYET GGTQKV 

Seq ID NO: 209 DNA sequence 

Nucleic Acid Accession #: NM_001327.1 

Coding sequence: 89-631 

1 11 21 31 41 51 

| | | | | | 

AGCAGGGGGC GCTGTGTGTA COaVGAATAC GAGAATACCT OGTGGGCCCT GACCTTCTCT 60 

CTGAGAGCCX3 GGCAGAGGCT CCGGAGCCAT GCAGGCCGAA GGCCGGGGCA CAGGG6GTTC 120 

GACGGGCGAT GCTGATGGCC CAGGAGGCCC TGGCATTCCT 6ATGGCCCAG GGGGCAATGC 180 

TGGCGGCCCA GGAGAGGCGG GTGCCACGGG CGGCAGAGGT CCCCGGGGCG CAGGGGCAGC 240 

AAGGGCCTCG GGGCCXSGGAG GAGGOGCCCC GCGGGGTCCG CATGGCGGCG CGGCTTCAGG 300 

GCTGAATGGA TGCTGCAGAT G06GGGCCAG GGGGCCG6A6 AGCCGGCTGC TTGAGTTCTA 360 

CCTCGCCATG CCTTTCGCGA CACCCATGGA AGCAGAGCTG GCCC6CAGGA GCCTGGCCCA 420 

GGATGCCCCA CCGCTTCCCG TGCCAGGGGT GCTTCTGAAG GA6TTCACTG TGTCCX3GCAA 480 

CATACTGACT ATCCGACTGA CTGCTGCAGA CCACCGCCAA CTGCAGCTCT CCATCAGCTC 540 

CTGTCTCCAG CAGCTTTCCC TGTTGATGTG GATCACGCAG TGCTTTCTGC CCGTGTTTTT 600 

GGCTCAOCCT CCCTCAGGGC AGAGGCGCTA AGCCCAGCCT GGCGCCCCTT CCTAGGTCAT 660 

GCCTCCTCCC CTAGGGAATG GTCCCAGCAC GAGTGGCCAG TTCATTGTGG GGGCCTGATT 720 
GTTTGTCGCT GGAGGAGGAC GGCTTACATG TTTGTTTCTG TAGAAAATAA AACTGAGCTA 

Seq ID NO: 210 Protein sequence: 
Protein Accession #: NP_001318.1 

1 11 21 31 41 51 

| | | | | | 

MQAEGRGTGG STGOAOGPGG PGIPDGPGGN AGGPGEAGAT GGSGPRGAGA ARASGPGGGA 60 
PRGPHGGAAS GLHGCCRC6A RGPESRLLEF YLAMPFATPM EAELARRSLA QDAPPtiPVPG 120 
VLLKEFTVSG NXLTIRLTAA DKRQIiQLSIS SCLQQLSI»LH HITQCFLFVF LAQPPSGQRR 

Seq ID NO: 211 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 52-459 

1 11 21 31 41 51 

| | | | | | 

CCTCGTGGGC CCTGACCTTC TCTCTGAGAG CCX3GGCAGAG GCTCCGGAGC CATGCAGGCC 60 
GAA6GCCAGG GCACAGGGG6 TTOGAOGGGC GATGCTGATG GCCCAGGAGG CCCTGGCATT 120 
GCTGATGGCC CAGGGGGCAA TGCTCGCGGC CCAG6A6AGG OGGGTGCCAC GGGGGGCAGA 180 
GGTCOXGGG GGGCAGGGGC AGCAAGGGCC TCGGGGCOGA GAGGAGGOGC CC060GGGGT 240 
COGCATGGCG GTGCCCCTTC TGCGCAGGAT GGAAGGTGCC CCTGCGGGGC CAGGAS6C0G 300 
GACAGCOGCC TGCTTCAGTT CCGACTGACT GCTGCAGAOC ACG60CAACT GCRGCTCTCC 360 

ATCAGCTCCT GTCTCCaCCA GCTTTCCCTG TTGATGTGGA TCACGCAGTG CTT'TCTGOOC 420 

GTGTrrTTGG CTCAGGCTCC CTCAGGGCAG AGGCGCTAAG CCCAGCCTGG CGCTOCTTCC 480 

TAGGTCATGC CTCCTCCCCT AGGGAATGGT CCCACCAOGA GTGGCCAGTT OVTTGIGGGG 540 

GCCIGATTGT TTGTajCTGG AGGAGGAOX; CTTACATGTT TGTTTCTGTA GAAAATAAAG 600 
CTGAGCTA 

Seq ID NO: 212 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MQABGQGTGG STGDAZXIPOG FGZPD6PGGN A06PCEAGAT GGfiGPKGAGA ARASGPRGGA 60 

PRGPHGGAAS AQDGRCPGGA RSPD5RX<LQ? RLTAADHSQL QLSISSCLQQ L5LLMWITQC 120 

FLFVFLAOAP SGQRR 

Seq ID NO: 213 DNA sequence 
Nucleic Acid Accession #: NM_000555 
Coding sequence: 416..1498 

1 11 21 31 41 51 

| | | | | | 

CTTATTTTTT ATGAATGTOG GATAGCTGCA CCAGCTTQGT GGGGAAAGGG TTTGATGAAT 60 

AGCACAAAGA CACTGGCTGT TCCXTGGAGG CTGTCOCrrT AAAGGAGAAT CTTAOTTTAT 120 

TCTGGGGGGA GGGGATGCAC ACATTAGAGT AGGAAAGAGG GCTTGGAATA AAATGAAAAC 180 

ACTCCCCCTT CATACjTCATT GTACTGAAAT GCAAAGACIG CTTCCTAAGC TGGAGATGCT 240 

AACCTTGGGT AGCTCCTTCT GTTCTCTTCA AGGGGAATTT TGTCAGGCTA TGGATTCATT 300 

TACAACTGTT AGTCATGTGG GCATGTGTGA GGAAACAGAT GCCAGTTTTA ATGTATTTAG 360 

CCOGAAGTTC CAATTTGATA GGAGCCACTG TCAGTCTCTG AGGTTCCACC AAAATATGGA 420 

ACTTGATTTT GGACACTTTG ACGAAAGAGA TAAGACATCC AGGAACATGC GAGGCTCCCG 480 

GATGAATGGG TTGCCTAGCC CCACTCACAG CGOCCACTGT AGCTTCTACC GAACCAGAAC 540 

CTTGCAGGCA CTGAGTAATG AGAAGAAAGC CAAGAAGGTA OGTTTCTACC GCAATGGGGA 600 

COGCTACTTC AAGGGGATTG TGTACGCTGT GTCCTCIGAC OGTTTTCGCA GCTTTGACGC 660 

CTTCCTGGCT GACCTGAOGC GATCTCTGTC TGACAACATC AACCTGCCTC AGGGAGTGOG 720 

TTACATTTAC ACCATTGATG GATCCAGGAA GATOSGAAGC ATGGATGAAC TGGAGGAAGG 780 

G6AAAGCTAT GTCTGTTCCT CAGACAACTT CTTTAAAAAG GTGGAGTACA CCAAGAATGT 840 

CAATCCCAAC TGGTCTGTCA ACGTAAAAAC ATCTGCCAAT ATGAAAGCCC CCCAGTCCTT 900 

GGCTAGCAGC AACAGTGCAC AGGCCAGGGA 6AACAAGGAC TTTGTGOGCC CCAAGCTGGT 960 

TACCATCATC CGCAGTGGGG TGAAGCCTCG GAAGGCTGTG CGTGTGCTTC TGAACAAGAA 1020 

GACAGCCCAC TCTTTTGAGC AAGTCCTCAC TGATATCACA GAAGCCATCA AACTGGAGAC 1080 

CGGGGTTGTC AAAAAACTCT ACACTCTGGA TGGAAAACAG GTAACTTGTC TCCATGATTT 1140 

CTTTGGTCAT GATGATGTGT TTATTGCCTG TGGTCCTGAA AAATTTOGCT ATGCTCAGGA 1200 

TCATTTTTCT CTGGATGAAA ATGAATGCCG AGTCATGAAG GGAAACCCAT CAGCCACAGC 1260 

TGGCCCAAAG GCATCCCCAA CACCTCAGAA GACTTCAGCC AAGA6CCCTG GTCCTATGOG 1320 

CCGAAGCAAG TCTCC3VGCPG ACTCAGCAAA CGGAACCTCC ACCAGCCAGC TCTCTACCCC 1380 

CAAGTCTAAG CAGTCTCCCA TCTCPACGCC CACCAGTCXTP GGCA6CCTCC GGAAGCACAA 1440 

GGACCTGTAC CTGCCTCTGT CCTTGGATGA CTOGGACTOG CTTGGTGATT CCATGTAAAG 1500 

GAGGGGAGAG TGCTCAGAGT CCAGAGTACA AATCCAAGCC TATCATTGTA GTAGGGTACT 1560 

TCTGCTCAAG TtSTCCAACAG GGCTATTGGT GCTTTCAAGT TTTTATTTTG TTGTTGTTGT 1620 

TATTTTGAAA AACACATTGT AATATGTTGG GTTTATTTTC CTGTGATTTC TCCTCTGGGC 1680 

CACTGATCCA CAGTTACCAA TTATGAGAGA TAGATTGATA ACCATCCTTT GGGGCAGCAT 1740 

TCCAGGGATG CAAAATGTGC TAGTCCATGA CCTTTCAATG GAAAGCTTAG GGGCCTGGGG 1800 

TAAATTTGCC CCGTTTAAAT TTGCCCAAAC AGTTTTCCTT TTGTAGAGGG GTGTTTAAAT 1860 

ATACAGCAAT TAAAAAGTTT GTGTGGGGAA AAAAAAAACT CATTGGCAGA TCCAAGAATG 1920 

ACAAACACAA GTGCCCCTTT TCTCTGGATC TCAAGAATGG TGGAGGACCC TGGAAGGACA 1980 

GCAAGGCAGC TCCCCAGCCT CACTCTTCAC TCCTGATTGA GGCCCGGGTT TGTTGTCCAG 2040 

CACCAATTCT GGCTGTCAAT GGGGACAAAT AAACCAACAA CTTATAATTG TGACACCAGA 2100 

7GCTTAG6AT CCTGGTGCTG GGTTAGCTAA GAGAATAGAC AfiAATTGGAA AATACTGCAG 2160 

ACATTTCCGA AGAGTTTATA AAGCACAGTG AATTCCTGGT CAATCTCTCC ACTGAGGCAA 2220 

TTTGGAATCA ATAAGCAATT GATAATAGTT TGGAGTAAGG GACTTCATAT ACCTGATTCC 2280 

TCTAGAAGGC TGTCTAACAT ACCACATGAT TACATGAACT GTATGGTATC CATCTATCTC 2340 

TGTTCTATTG AATGCCTTGT TAACAGCCAA CACTGAAAAC ACTGTGAGAA TrTGTTTTCA 2400 

GGTCTGACAC CTTTCAGTCT CTTTTTATAG CAAGAAATCA ATATCCTTTT TATAAAAATT 2460 

CATGTCTCTA TTTCaGGAGC AAACTCTTCA 6GCTCCTTTT TTATAAACTG GTGATTTrTC 2520 

TTTTGTCTAA AAAAOVCATG AAGAAAATTT ACCAGAAAAA AAAAAAAAAG COGAAGAATA 2580 

ATGTTATTTA GAAATTATGC TGTCACTGCC AAACAGTAAC CTCCAGGAGA AAACAA6AT6 2640 

AATAGCAGAG GCCAATTCAA TAGAATCAGT TTTTTGATAG CTTTTTAACA GTTATGCTTG 2700 

CATTAATAAT TTCAATGTGG ACCAGACATT CTAATTATAT TTTAAATGAA ATGTTACAGC 2760 

ATATTTTAAG CAACTCTTTT TATCTATAAT CCTAATATTT CATACTGAAG ACACAGAAAT 2820 

CTTTCACTTG TCTTTAACAT TAGAAAGGAT TTCTCTTTAC TAAGGACTGA TCATTTGAAA 2880 

TAGTTTTCAG TCTTTTGAGA TACAGGTTTA TAACACTGCT TTTTTTTTCC TGTAAACATA 2940 

GCCCATAATG GCAAAAACAA CTAATTTTAA TTGAAGGTCT TGCTTGCCAN TCCTGTGTTG 3000 

GCTTTNACCA AATATAAAAA TTCCCTTATT CCTTGGTAAT GGTGCAAATO TTTGGAAAGG 3060 

CACAGCATCC AAACCAAGCT GCTGTTTGGC TACTGAATGG CTTGCAGTTG TTCCTCCACT 3120 

CTAAATCGAA TGAGCTTGCT GTGTGTGTX3T GTGGTGGTGG TGGGAGGGGG TGGTGCATGT 3180 

GTGTGTGTGT GTGTGCATCT GCAGCTGCTT CAAAATTAAO AAATACTACA AGACACCCCT 3240 

GTAATQGATT GGTGGCAACT GGGTGGCACT GCTG^TGTGC ACTGT6TAGG GGGGAACCCA 3300 

GTGGTGGTGG GGTATCTCAA ATGCCCCTAG ACAAGCTTCA GAtGTCTGTA GCTACCAAAA 3360 

ACATTTTCGG TTCAAGAAAA GTGAGATGAT GGTAGTACTG GTTTCTGGTG AAATTGAAAA 3420 

ACCCCAAATG ATGAGGATCT CTTTTTGCCC CCTCTCCTTT TTTTGTAAAC CCATTCAAAA 3480 

CCATTAATAA GCCCATTTTA CTAANCCCCT ATTTCTTTCT AGAAGCTCAG GGTTTNCTTA 3540 

GTGCCTCCCA NAACATTTTG TAGTTAATTG GGAAAAAGTG ATACTTGGAT TAGGGGGTGT 3600 

GGGCATAAAO AATGGTGGGA GGCCTGATTT TAAAATTCAG GGCAGAACCC CCAATGACTC 3660 

CACCCATAGT HTCACTTTAG GTCTCATTTA GTCCRTCACC TTTATmAA GTTGAGGAAG 3720 

TGGAGGCTGG TAAAGAGCAG GACCAGAGGA AGAATCCAGA TTTCCTTATG CTTGGGCCTC 3780 

ACACTAGCTC TNT6AGTATT TCCTTGATTG OGGTATATGT ACTACTAGAA AATACCAAAT 3840 

GGATATATTT TCTTTAGGAT AACCTTTGAA CCAACAATNT TCAATAACAA TAGTACATCT 3900 
TCCATCTTAC TTTTAATOSA GTATAAGGAA 
AGGGGATGAG GCTPGGCATA GTCCAAAATT 
TTGTTTTAAA TTGGCCCACT TTCAAGGCAA 
CCACCCCTGT CATTCACTTC CAATTTTACC 
TCTTAATTTT TQCAOGGTCT ACACACATCA 
ACTTCTCCCT CTTTTTTACA CACACACACA 
TTCCTACCTC CCTGATTTTT CTTCCCTACA 
ATGTATATAT TGGGGCTGOG CTGAAGAACT 
ATTGAGAGAA AAGCTCCTTT TCTCTTCACT 
GCTTTGTGTC CTTAIG6ACT TTAGTATTAG 
GJU^GGAACCT TAAGATCACA TCATCTACTC 
AGGAAACCGA GACACAGAGG TAAAGTAATT 
GATTGGGTTr ACRACCCACA TCTCCTGGCr 
TATTGCCTTC CATTAGGCTC CTGAGAGTTA 
15 ACATGCTGCT GCXXTGATCT CAGTGGGAAA 
CCCTGCATTC ACCTGGTTCC CATOCACATG 
ATTGAGGGGC AATAGGAGCA ATGGGGTCCC 
TGCTCCATCX3 TGAGGAGCCC TCTGAATAGC 
GAATGGAGGA AGATTGATTT TCTCCATCAG 

20 CTTTCCAGGC TGAGGGAAAT GTTTCTTGTT 
ATANCTTTGT TCATCCTAAC TTTCTGAGAT 
GGTACCATTC ACTGGCANGA TTTNTTTTAG 
AAGCTGGTCr CACTOTGGTT GCCCTCATCC 
GAGGACAATT TOCASGZATA AGCAAGGGGC 

25 TAAACATTGG CTOCTGTGTT TGCACCAAAA 
CATCGTCTTG TGTACACTGC TCCTGTGGCC 
CAAACACATG GTTTTCCTTG CTGCAAGGCT 
TTCAGTTNTA AGAGACCTCC TTCTGGGCTT 
CCTCCTTCTC CTCCACASTC ACAAGTAAOC 

30 TGAA6AAGGC AAGGAACGCT GAGATTCTTC 
GTGATTGGTG CTTACCTTGA ACAAAATTTT 
GGTACAATGC TCCCAATCAC CCTGCACATT 
TCCATATCCC TAGGACAAGA NAACAGGATG 
TGATGGGAAT GATCCCAANG ATCACCCCAC 

35 CCAGATAGAA NCACTGGGAC AGTGGTTTGA 
AtGGAAATAA AAG6CATTGA TT7TTTAAAA 
AGOGCCACTT GGATCCATTT CCAGGCCTTA 
GCTTTAAGTC CCA6ACTG6T CTCCCAAGTG 
TGAGGCATGA GAATGTTGCC CCATCTATCC 

40 TCCTAAAGCC TGGTCCCCAA AAATTGTTTT 
CACCCAKACT CTTAGTGTTG CGTCCTGCCT 
CCCGCTTTGG CTTAGCTAGC GTGACATTGS 
TTTTTTTTTG ACTGAGTCTC CCTCTGTCAC 
CTCGCTGCAA CCTTCACCCT TCACCTCCCA 

45 OGAGTAGCTG GGATTACAGG CGTGCGCCAC 
TTTTAGTAGA GATGGGGTTT CACCATGTTG 
ATTATCTGCC CACCTCGGCC TCCCAAAGTG 
GCTGACAAGA CTAATTTTTT ATCCCTTGGT 
GGTGATTTTT TCTTACCTTG GATGCCTGAG 

50 TTAAGGCATC TTTCTGCTCC TGATCAGAAG 
CAACAGAAGT CACCTTGTAA GTAA6GCAAA 
TTAGGTCAAT AACCTTGAGG GAATCAATGG 
TCTTTGACTT TTCTTTCTCT GTCTAGTTTC 
AGTCTCTCTT TCCACAGTAC AAACATCCAT 

55 CTTATTATCT TCATTTGTftC TTTTTCCTTC 
TTCTTAGCCT GTGATTTTGC CTTGGGACTG 
CTGCrCCTAC CCCAGTCCAA TCAGAAGTAT 
TTTCTTCTTC TCCATTTTCA TTCGTAATCC 
TATAGCTCAT GTATCTTTAG GTCTTTGCCT 

60 CCTTTTTAGT CTGACATTTT GTGGAGCAGT 
GAGAAAAAAT CCACCCATGG ATTTATATCA 
GTGCCCATAA TTTTTAAAGC TGCAATATAA 
GTCATTT6TT TtGGCTGGAT GGGGGTGGG|G 

^ ATTAAACTCT CTATAATAAT CTTGTTTGGG 

65 ACTGGTAGGC AATCGGAGTT GATTTAAATG 
AAAGAGACAT TTGCTTAATT GACATGTATT 
CTTGACACCA ACTGTTCATG ATACTGAATA 
TAAAGAAGCC A6ATTGTAG6 TOTTAATTTA 
TCACTCCTTG GCAATACX^T ATG6CATGCC 

70 ATCCAAAAGG GATTTGAACA AGTAAGAGGT 
GTAATATATT GCAGCTTGAA GCCAATGATC 
TTTATTCAAT TTTGCATTTC CCAOGTGTGG 
NTGTGCCATT AAACTTGTAC AGAAAATGTT 
AAAATGGAAA CAGCCCACCC TTTCTGCXrCT 

75 CAAAACAGCT GTAATTGGTG GTTGTAGTGT 
TGGAGAGTAA AT6CATGGTA TTGTACATCA 
GAATATATTC TTCTTTGTAG TCCTTCTTCC 
CCAGTTGTCT TACAGTTGTA AATATCTGAT 
TCAGCAAACA ACAAACAAAC CAAAATGTGG 

80 AGTTATTGAT CATTTCTTAA GGAACAGCAT 
TCAGTGGTAA ATTGGGGTTG TATIGGCCAT 
AATCACATGT AATCCAAAGA CAGTAGGTAG 
GATAGAGACC TCAGAAGACT CTGCTTGACC 
AAAAATGAGA GAAATAAAAC AGATATTTAA 

85 AGCCAGAAAA AAAAACAAGG GCATGAGTTC 
AOCTAACCTA CTCT6AAATT GTGATTCAAA 
TGGTTTGCTG ACCCCACTTG GACTGGTAGG 



PCT/US02/12476 

ATCTTTCTTP ATGGCCATTT TGGAGGGAGC 39G0 

TAACKCrCCA ATAATTAATT GCATTTTAAA 4020 

TTTTTTTTGT GTGTCTGTAA CTGAGCTCCT 40 BO 

CAATCCAATT TTAGCACTCA AGTTCCATTG 4140 

AGTCAGCAAG CATTTGCCAC CACTCCCTAT 4200 

CACACACACA CACAATCCAT CTCTTGCTTG 4260 

GAAATAGAAA TAGGGACAAA GAAGGGGAAA 4320 

AACTTCATAA GTAGTATTAA CTAGGGGTAA 4380 

GTTTTGGAAA GGATAGCCAT TAGCATGACT 4440 

CCTAGATTGA ATTATAGOGT TTTTCTAGCT 4S00 

CTCTACrCCA AATTTCrCAT TCTTCAG6CC 4S60 

TCOCCAAGGT CACACAGCTG GCTGGGGCAG 4fi20 

CTTATTOCAG GGCCmTCC CACTAAGTAG 4G80 

TTTCTCAGGG TCATGTTGCA TCTTGGAGCC 4740 

TNCACCCAGC AACCTAATAC AGCCCCTTTT 4800 

GGTTGCAGAT GTCCTTGAAG AGAGTGAGGC 4860 

TGGCCTTGTC CATCTGATTC AGGAGATCAC 4920 

CCCCCACTGA ATGCTTGCCT TGC CCAAATG 4980 

TTCACCTTGT GTCATCTCAT AATGGTTGGT S040 

TCCANAGTAH AAAAAAGAAA GA6TGGAACA 5100 

GGCTTTTCAA CATTTAAAAA AAACTAGTGT 5160 

AATATGGGAG TAAGATGAGG TAGAGAAAAT 522 0 

ACAATGTCCC CAAAGCCATC CTGCTNTGAT 5280 

TrrSTGACAA AAATGTAGOC TG6CTGATGT 5340 

TAGCAAGCTG TGTGCTCTAT ACACTCTTCC 5400 

TTCCACAGCA 6AAACCAGG6 CAAAAGGGTC 5460 

NTTCCTGGGA ACTAAQGGGG TATTTATTAG SS20 

AOCCCACTCC TCAGGTACTT CTCTCTCCTT 5580 

AAOGAACCTG AAAGTGGATG TGTAGCTATT 5640 

TTTGAATCCT TTAGTCCAAG TCTTAGACCA 5700 

GTCTGTGTTC CTAATCCCTT CAATACTNTG 5760 

TGATTCTAAA TGGCTTTTAT TTTTTAAAAA 5820 

CCTATATCCC CAAAATGAGC TCCAGGACAC 5880 

CTCAGAAAAC GTCTGTGCCA AKAGACTTCC 5940 

AOGACTTCIT TTATGGTTGT CCAGTTTGCT 6000 

AAGATGATTG GAACCTGTCT TTGGCCACAT 6060 

CTCATATATT GCCTTCACT6 AAGGGCTTTG 6X20 

AACCATAAGT GTTPTGGAGC TCATCTGGGG 6180 

CTTCAGGAAA AGC?rGCCTTC CCTCCCTTTC 6240 

TGTCTCCAAA AGTCTAGTAT GGTCTTTATA 6300 

TGTTTCCTTG TTAAGGATCT ATGCANACCT 6360 

CTATCATTTG ACAAGACTAA CTTTTTTTTT 6420 

CTAGGCTGGA GTGCAGT6GC ACAATCTTGG 6480 

GGTC6AAG0G ATTCTCCTGC CTCAGTCTCC 6540 

CAAATCTGGC TATTTTTTTA TTATTATTAT 6600 

GCCAGACTGG TCTTGAACTC TTGGCCTCAA 6660 

CTGGGATTAC AGGCATGAGC ACCATGOXA 6720 

TTATTGGCTT CAACATCTTC TGGAATCAGA 6780 

ACTAGGGGAG TATAGAATTC CAATTGGTAA 6840 

GGCAGGTTAG TTGGGAGAGG TCA6AT66CA 6900 

GACTTTGAAG GCATTAGOGT TTCTCATTAC 6960 

CTTTTTTGCC GCTCTAOCTC TTTGTGTATC 7020 

CTCTGTTCTC AGTTTATATT CTATGTTATC 7080 

CCTTT C TCCT GTGGAATTCT GTCTCTCCCT 7140 

CTCCCTGTCT AGGCATTGGG CATGTGCCTC 7200 

ATGATAAATT ATTTCCAGAT TCAATCAGCC 7260 

GTTGGTGGGG AATCAACCTG ATCCTGGCCC 7320 

CCCTCAGCAG ATCTTTACAA GCAGTTTOCT 7380 

TCCAAGCACT GTACAGAATA CTTTGTGGTT 7440 

GAAGCGTGCT CAGAGACATA ATCAGCTGAA 7500 

GCTAAATACT AATAATTGAT TTTGTTTGAT 7560 

TATAATGAGG GACCACAGGT AATTTCTCCT 7620 

GAGTAATTGC TTAAAGTTTT ACCATTACAC 7680 

GCTTGCTAAC TGTTGAGCTG TTTTAACTAA 7740 

AAAAGATAAT TTAACAAATC TATACTATAA 7800 

TTTTCCTTCT GAGTCACCTA AACATTTACT 7860 

GACAGTCCAT ATAA6AGAAA TTAGTGGACC 7920 

TTAAACAGAA TTGGAAA6CC CTTGGAAATG 7980 

AAAATTTACA ATGACTTTTC TTTATAAGTT 8040 

TATGCCAAAA TGTCTCCAAT GTATGGTCCT 8100 

CCTTATGACT TGTATACAAC TAATGCATGT 8160 

TAAGTCTTTA AAATGTTTTT GATCACCTTT 8220 

TTTATGCCCA TTTTCAAAGG GAGAAAGTTT 8280 

ATAGCTGTAG TTAGAATTGA GTACCTGTAG 8340 

TAGAGGTGTT AGCTTGCTAG TGACTAGCTT 8400 

CATTTCTTAA CTCGTTTTAA CCTCTGAAAA 8460 

CACOCCCTTG CCCTCTCCCT CTCCCTGCTC '8520 

TTGAGGCCCA ATAACTCTTG CCAAGTAAAG 8580 

GGAAAAGGCA TTTCTCAACC ATCTCTCAGC 8640 

TGTGATCAAA GACTCAACTT TACGTAAAAA 8700 

TGATTACATT CAGGATTGAA TAGTTTTCAG 8760 

TGATGTCCCT TATCCCTGCA GCTGTTTTAA 8820 

GATGACCAAT AATTATTTGA AAAAAAAAGA 8880 

GAACTTTAGC CACCTATTTA 6AATAGTTAT 6940 

AAATGCATTA CTATCAGTGT CCTAGGCAAT 9000 

AGCAGTATTT CAAGAGGCAT TCTOCTTTTT 9060 

TTTGGTGA6G CCCCCATAAA CCAGCTGGAG 9120 
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CACACCXTTT TCRTCTCCIG TCCCTGTAAC ACCCXTCTTC COCCftCCCCC TOO BOUITIC 9180 

AATGAGOSCr TTCTTCGGTC fiCAOGACTTC WMjGITCTCT AGACRAGTTT GOCAIGTeiG 9240 

TMUSerCCTG TGAACTCrGA GTGCTGAASA TTOGCAGCAT TOVATACCAG GCAGOCAAAG 9J0O 

MCTGCTCTT GCAATTATTT TGGCTCTCAA CCTCIGTTCT TCATCECATT CTCRTTTCIG 93SO 
TGTACATTTG CMGATGTGT GTAATGTCAT TTTCOVAAAA TAAAATTTGA TTTCAAT 
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10 
15 
20 
25 
30 
35 
40 



Seq ID NO: 214 Protein sequence:
Protein Accession #: NP_000546

1          11         21         31         41         51
|          |          |          |          |          |
MELDPGHFDS RDKTSRNMRG SRMNGLPSPT HSAHCSFYRT RTLQALSNEK KAKKVRFYHN 60
GDRYFKGIVY AVSSORFRSF DAUADIiTRS LSDNINLPQG VHYirriDGS RKIGSMDBLB 120
EGBSYVCSSD KFPKKVEYTK NVNPNWSVNV KTSANMKAPQ SLASSSSAQA RENKDPVRPK 180
ttVTIIRSGVK PRKAVRVLLN KKTAHSFBQV LTDITEAIKL ETGWKKLYT LDGKQVTCLH 240
DFFGDDOVFZ ACX3PEKFRYA QDDFSLDSME CaVMKGNPSA TAGPKASPTP QRTSAKSPGP 300
MRRSKSPADS AMGTSSSQLS TPKSKQ5FZS TPTSPGSIiRX HKDLYLPLSL DDSDSUaSM


Seq ID NO: 215 DNA sequence
Nucleic Acid Accession #: NM_130457
Coding sequence: 312..644

1          11         21         31         41         51
|          |          |          |          |          |
GGCAOGAGGC AGAGCTCTGC AAGGAGAGGT TGTGTCTTCG TTCTTTCCGC CATCTTCX5TT 60
CTTTCCAACA TCTTCGTTCT TTCTCACTGA CCGAGACTCA GCXX^GTAGGT CTGCAGAGTG 120
GTCTTCCTGG TAATTTAGTT GTGAGTGAAT GTGTGGAGGA GCCAGCGGGC TTAGGACAGG 180
TCCTGTGGCA CAGTCCXJTGG CTTTGAGGGA AAAGGGCCrC GOGGTGGTCC TCOGCCTTCC 240
CCCAGGTCGT GATGCAG6CG CCATGGGCOG GTAATCXjTGG CTGGGCTGGA ACGAGGGAGG 300
AAGTGAGAGA TATGAGTGAG CATGTAACAA GATOOCRATC CTCAGAAAGA GGAAATGACC 360
AAGAGTCTTC CCAGCCAGTT GGACCTGTGA TTGTCCAGCA GCCCACTGAG GAAAAACGTC 420
AAGAAGAGGA ACCACCAACT GATAATCAGG GTATTGCACC TAGTGGGGAG ATCAAAAATG 480
AAGGAGCACC TGCTGTTCAA GGGACTGATG TGGAAGCTTT TCAACAGGAA CTGGCTCTGC 540
TTAAGATAGA GGATGCACCT GGAGATGGTC CTGATGTCAG GGAGGGGACT CTGCCCACTT 600
TTGATCCXaC TAAAGTGCTG GAAGCA6GTG AAGGGCAACT ATAGGTTTAA ACCAAGACAA 660
ATGAAGACTG AAAOCAAGAA TATTGTTCTT ATGCTGGAAA TTTGACTGCT AACATTGTCT 720
TAATAAAGTT TTACAGTTTT CTGCAAAAAA AAAAAAAAAA AAA



Seq ID NO: 216 Protein sequence:
Protein Accession #: NP_569734

MSEHVTRSQS SER6NDQESS QPVGPVIVQQ PTEEKRQEEE PPTDNQGIAP SGEIKNEGAP
AVQGTDVBAF QQSLALLKIE DAPGDGPOVR EGTIiFTFDPT KVLEAGEGQL






Seq ID NO: 217 DNA sequence
Nucleic Acid Accession #: NM_001476
Coding sequence: 82..435

1          11         21         31         41         51
|          |          |          |          |          |
GCCAGG6AGC TGTGAGGCAG TGCTGTGTGG TTOCTGCOST CCGGACTCTT TTTCCTCTAC 60
TGAGATTCAT CTGTGTGAAA TATGAGTTGG OGAGGAAGAT CX3ACCTATTA TTGGCCTAGA 120
CCAAGGCGCT ATGTACAGCC TCCTGAAGTG ATTGGGCCTA TGOSGCCCGA GCAGTTCAGT 180
GATGAAGTGG AACCAGCAAC ACCTGAAGAA GGGGAACCAG CAACTCAACG TCAGGATCCT 240
GCAGCTGCTC AGGAGGGAGA GGATGAGGGA 6CATCTGCAG GTCAAGGGCC GAAGCCTGAA 300
GCTGATAGCC AGGAACAGGG TCACCCACAG ACTGGGTGTG AGTGTGAAGA TGGTCCTGAT 360
GGGCAGGAGG TGGACCC3GCC AAATCCAGAG 6AGGTGAAAA CGCCTGAAGA AGGTGAAAAG 420
CAATCACAGT GTTAAAA6AA GACACGTTGA AATGATGCAG GCTGCTCCTA TGTTGGAAAT 480
TTGTTCATTA AAATTCTCCC AATAAAGCTT TACAGCCTTC T6CAAAA

Seq ID NO: 218 Protein sequence:
Protein Accession #: NP_001467.1

1          11         21         31         41         51
|          |          |          |          |          |
MSWRGRSTYY WPRPRRYVQP PEVIGPMRPE QFSOEVEPAT PEEGEPATQR QDPAAAQE6E 60
DEGASAGQGP KPEADSQEQG HPQTGCECED GPDGQEVDPP NPEEVKTPEE GEKQSQC

Seq ID NO: 219 DNA sequence
Nucleic Acid Accession #: NM_001476
Coding sequence: 90-3671

1          11         21         31         41         51
|          |          |          |          |          |
ACAGCGGAGC GCAGAGTGAG AACCAGCAAC CGAGGCGCCG GGCAGOGACC CCTGCAGCGG 60
AGACAGAGAC TGAGCGGCCC GGCACCGCCA TGCCTGCXSCT CTGGCTGGGC TGCTGCCTCT 120
GCTTCEGGCT CCTCCTGCCC GCAGCCCGGG OCACCTCCAG GAGQGAAGTC TGTGATTGCA 180
ATGGGAAGTC CAGGCAGTGT ATCTTTGATC GGGAACTTCA CAGACAAACT GGTAATGGAT 240
TCOGCTGCCT CAACTGCAAT GACAACACTG ATGGCATTCA CTGGGAGAAG TGCAAGAATG 300
GCTTTTACCG GCACAGAGAA AGGGACGGCT GTTTGCCCTG CAATTGTAAC TCCAAAGGTT 360
CrCTTAGTGC TCGATGTGAC AACTCTGGAC GGTGCAGCTG TAAAOCAGGT GTGACAGGAG 420
0CA6ATG0GA CCGATGTCTG OCAGGCTTCC ACATGCTCAC GGATGOGGGG TGCACCCAAG 480
ACCAGAGACT GCXAGACTGC AAGTGTGACT GTGACOCAGC TGGCATOGCA GGGCCXrrGTG 540
A060GGGC0G CTGTGTCTGC AAGCCAGCTG TTACTGGAGA AOGCTGTGAT AGGTGTOGAT 600
GAGGTTACTA TAATCTGGAT OGGGGGAACC CTGAGGGCTG TAOOCAGTGT TTCTGCTATG 660
GGCATTCAGC CAGCTGCCGC AGCTCTGCAG AATACAGTOT OCATAAGATC ACCTCTACCr 720
TTCATCAAGA TGTTGATGGC TGGAAGGCTG TCCAOGAAA TGGGTCTCCT GCAAAGCTCC 780
AATGGTCACA GCGCCATCAA GATGTGTTTA GCTCAGCCCA AOGACTAGAC CCTGTCTATT 840
TTGTGGCTCC TGCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900
TTGACTACCG TGTGGACAGA GGAGGCAGAC ACCCATCTGC OCATGATGTG ATTCTGGAAG 960
GTGCTGGTCT ACGGATCACA GCTCCCTTGA T6CCACTTGG CAA6ACACTG CCTTGTG6GC 1020
TCACCAAGAC TTACACATTC AGG7TAAAT6 AGCATCCAAG CAATAATTGG AGCCCOCAGC 1080
TGACTTACTT TGAGTATCGA AGGTTACTGC GGAATCrCAC AGCCCTCOGC ATCCGAGCTA 1140
CATATGGAGA ATACAGTACT GGGTACATTG ACAATGTGAC CCTGATTTCA GCCC6CCCTG 1200
TCTCTGGAGC CCCAGCACCC TGGGTTGAAC AGTGTATATG TOCTGTTGGG TACAAGGGGC 1260
AATTCTGCCA GGATTGTGCT TCTGGCTACA AGAGAGATTC AGCGAGACTG GGGOCTTTTG 1320
GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GftGGGGCCTG TGATCCAGAC ACAGGAGATT 1380
GTTATTCAGG GGATGAGAAT CCTGACaTTG AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440
ACGATCCGCA OGACCCXXX3C AGCTGCAAGC CATGTCCCTG TCATAACGGG TTCAGCTGCT 1500
CAGTGATGCC GGAGACGGAG GAGGTGGTGT GCAATAACTG CCCTCCOGGG GTCACOGGTG 1560
CC0GC7GTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGCCCAG 1620
TGAGGCC7TG TCAGCCCTGT CAATGCAACA ACAATGTGGA CCCCAGTGCC TCTGGGAATT 1680
GTGACCGGCT GACAGGCAGG TGTTTGAAGT GTATCCACAA CACAGCOGGC ATCTACT606 1740
ACCAGTGCAA AGCAGGCTAC TTCGGGGAOC CATTGGCTCC CAACCCAGCA GACAAGTGTG 1800
GACCTTGCAA CTGTAACCCC ATGGGCTCAG AGCCIGTAGG ATGTGGAAGT GATGGCACCT 1860
GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGTGAGCA TGGAGCATTC AGCTGTCCAG 1920
CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGG 1980
AGGCCCTGAT TTCAAAGGCT CAGGGTGGTG ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040
GCAGGATGCA GCA06CTGAG CAGGCGCrrC AG6AGATTCT GAGAGATGCC CAGATTTCAG 2100
AAGGTGCTAG CAGATCCCTT GGTCTCCAGT TGGCCAAGGT GAGGAGCCAA GAGAACAGCT 2160
ACCAGAGCCG CCTGGATGAC CTCAAGATGA CTGTGGAAAG AGTTCX^GGCT CTGGGAAGTC 2220
AGTACCAGAA CCX3AGTTCGG GATACTCACA GGCTCATCAC TCAGATGCAG CTGAGGCTGG 2280
CA6AAAGTGA AGCTTCCTTG GGAAACACTA ACATTCCTGC CTCAGACCAC TAOGTOGGGC 2340
CAAATGGCTT TAAAAGTCTG GCTCAGGAGG CCACAAGATT AGCAGAAAGC CACGTTGAGT 2400
CAGCCAGTAA CATGGAGCAA CTGACAAGGG AAACTGAGGA CTATTCCAAA CAAGCCCTCT 2460
CACTGGTGOG CAAGGCCCTG CATGAACGAG TGGGAAGCGG AAGOGGTAGC CCGGACX3GTG 2520
CTGTGGTGCA AGGGCTTGTG GAAAAATTGG AGAAAACCAA GTOCCXCGCC CAGCAGTTGA 2580
CAAGGGAGGC CACrCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTOCGCC 2640
TCCTGGATTC AGTGTCTCGG CTTCAGGGAG TCAGTGATCA GTCCTTTCAG GTGGAAGAAG 2700
CAAAGAGGAT CAAACAAAAA GCGGATTCAC TCTCAACGCT GGTAACCAGG CATATGGATG 2760
AGTTCAAGOG TACACAAAAG AA7CTGGGAA ACTGGAAAGA AGAAGCACAG CAGCTCTTAC 2820
AGAATGGAAA AAGTGGGAGA GAGAAATCAG ATCAGCTGCT TTOCCGTGCC AATCTTGCTA 2880
AAAGCAGAGC ACAAGAAGCA CTGAGTATGG GCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940
TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCT6AAG 3000
AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060
AGCAAGCAGA AAGAGCCCTG GGGAGCGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120
CGGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT CTGAACTTGG 3180
AAGCCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240
GTGAGATGAG GGAAGTGGAA GGAGAGCIGG AAAGGAAGGA GCTGGAGTTT GACACGAATA 3300
TGGATGCAGT ACAGATGGTG ATTACAGAAG CCCAGAAGGT TGATACCAGA GCCAAGAAOG 3360
CTGGGGTTAC AATCCAAGAC ACACTCAACA CATTAGACGG CCTCCTGCAT CTGATGGACC 3420
AGCCTCTCAG TGTAGATGAA GAGGGGCTGG TCTTACTGGA GCAGAAGCTT TCCCGAGCCA 3480
AGACCCAGAT CAACAGCCAA CTGCGGCCCA TGATGTCAGA GCTGGAAGAG AGGGCACGTC 3540
AGCAGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600
AGAACTTGGA GAACATTAGG GACAACCTGC CCCCAGGCTG CTACAATACC CAGGCTCTTG 3660
AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGGGATA CAGATCTCAG 3720
GGCTCGGGAG CCATGTCATG TGAGTGGGTG GGATGGGGAC ATTTGAACAT GTTTAATGGG 3780
TATGCTCAGG TCAACTGACC TGACCCCATT CCTGATCCCA TGGCCAGGTG GTTGTCTTAT 3840
TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGAGGC AGATAGCACT GGGTGTGAGA 3900
ATGATCAAG6 ATCTGGACCC CAAAGAATAG ACTGGATGGA AAGACAAACT GCACAGGCAG 3960
ATGTTTGCCT CATAATACTC GTAAGTGGAG TCCTGGAATT TGGACAAGTG CTGTTGGGAT 4020
ATAGTCAACT TATTGTTTGA GTAATGTGAC TAAAGGAAAA AACTTTGACT TTGCCCAGGC 4080
ATGAAATTCT TCCTAATGTC AGAACAGAGT GCAAOCCAGT CACACTQTG6 CCAGTAAAAT 4140
ACTATTGCCT CATATTGTCC TCTGCAAGCT TCTTGCTGAT CAGAGTTCCT CCTACTTACA 4200
ACCCAGGGTG TGAACATGTT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTG 4260
AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320
GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TAGAGATTGC 4380
ATTTTTATTA AAGCATTTCC TACCAGCAAA GCAAATGTTG GGAAAGTATT TACTTTTTOG 4440
GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500
ATTAiGTCCTA ATTCAATCCT ACTTTTCGAA CACCAAAAAT GATGCGCATC AATGTATTTT 4560
ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCCACCCAT AATAAGAGAA TGTTCCTACT 4620
CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT IXXATCCATC TTTCCATCCA 4680
TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTG CCAGGGGCTG 4740
GTGGGACAGT GGTGACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800
AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTGTTTATT 4860
GCAATAACOG CTTGGTTTGC AACCTCTTTG CTGAACAGAA CATATGTTGC AAGACCCTCC 4920
CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAGCTCr GGGTTGTGCA CATTTCTTTG 4980
CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATTGC AACAGACTGT TGAGTTATGA 5040
TAACACCAGT GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCTTGGCT GGGAAGACTA 5100
TGGTGCTGCC TTGCTTCTGT ATTTCCTTGG ATTTTCCT6A AA6TGTTTTT AAATAAAGAA 5160
CAATTGTTAG ATGGC



Seq ID NO: 220 Protein sequence:
Protein Accession #: NP_005553

1          11         21         31         41         51
|          |          |          |          |          |



MPAUOjGCOj CFSUJ/PAAR atsrrevcdc kgksrqcifd relhrotgmg FBCLKOnaiT 60 

DGIECEKCKK GFYBHRBRDR CLPCKQJSKG SLSAROJSSG RCSCKPGVTG ARCDHCLeGF 120 

HMLTCAGCIQ IXIRLLDSXCD CDPAGIAGPC DACSCVCKPA VTGERCDRCB SGYYKI£H3GS 180 

PBGCIQCPCy CmSASCSSSA EYSVHKITST F^DVDGWKA VQHKGSPAKL QaSQfiHQDVF 240 

SSAQBUaPVY PVAPAKPLCW QQVSYQQSLiS FDYRVDSGGR HPSAHDVILS GAGIAITAPL 300 

MPLGKTU»OG LTKTVTFRLN EHPS!CIHSPQ LSYFEYRRLL HMLTALBI8A TYGBlfSTGYI 360 

DNVTIiISARP VSGAPAPWVE QCICPVGYKG QFCQDCASGY KRDSARIGPP GTCIPCWCQG 420 

GGACDPOTCaj CYSGDENPDI ECADCPIGFY KDPKDPRSCK PCPCHNGPSC SVMPETHEW 480 

CKNCPPGVTC ARCEIiCADGY FGDPFGEHGP VRPCQPCQCN NNVDPSASCaJ CDRLTGRCUC 540 

CIHSTAGIYC DQCKAGYTOD PIAPNPADKC RAQICKPMGS BPVGCRSDGT CVOCPGFGGP 600 

KC2HGAFSCP ACYKOVKIQM DQPIflQQUjaM EALXSIOQGG DGWHrrELS GRKQQAEQAL 660 

QDIUIDAQIS iX»SRSLGLO LAKVRSQENS YQSSUJDLKM TVERVRALGS QYQKRVRDTH 720 

RLITQJ^LSL AESEASLOJT NIPAS0HYVG PNCFKSLAQB ATRIAESHVE SASKMEQLTR 780 

BTEDYSKOAL SLVRKAUIEG VGSGSGSPDG AWQGLVEKI* EKTKSIAQQL THEATOASIB 840 

ADRSYQHSLR LLDSVSRLQG VSDQSFQVEE AKRIKQKADS I,STLVTRHJ£D EFKRTQKNM 900 

KWKBBAQQLI. QKQKSGREKS DQLLSRAMLA KSRAQEALSM GNATFYEVES ILKNLREPDL 960 

QVDNRKABAE EAMKSLSYIS QKVSDASDKT QQAERALGSA AADAQRAKNG AGEALBISSS 1020 

lEQEIGSUIL EANVTADGAL AMEKGLASLK SEMRBVBGEL ERKELEFDIII KDAVQMVITE 1080 

AQKVDTRAKN AGVTIQDTLN TUDGLI;HLMD QPLSVDBEGL VliBQKLSBA RTQIKSQLRP 1140
MMSEIiEERAR QQRGHLHLLS TSIDGILADV KKLENIRDNI. PPGCYNTQAL EQQ 

Seq ID NO: 221 DNA sequence
Nucleic Acid Accession #: NM_016529
Coding sequence: 13-1854

1          11         21         31         41         51

GTCAAGAAAA GAATGTCTGT AATTGTTCGA ACTCCTTCAG GAOGACTTCG GCTTTACTGT 60 

AAAOWGCTG ATAATGTGAT TTTTGAGAGA CTTTCAAAAG ACTCAAAATA TATGGAGGAA 120 

ACATTATGCC ATCTGGAATA CTTTGCCACG GAAGGCTTGC GGACTCTCTG TGTGGCTTAT 180 

GCTGATCTCT CTGAGAATGA GTATGA6GAG TGGCTGAAAG TCTATCAGGA AGCCAG CACC 240 

ATATTGAAGG ACAGAGCTCA ACGGTTGGAA GAGTGTTAOG AGATCATTGA GAAflAATTTG 300 

CTGCTACTTG GAGCCACAGC CATAGAAGAT OGCCTTCAAG CAGGAGTTCC AGAAACCATC 360 

GCAACACTGT TGAAGGCAGA AATTAAAATA TGGGTGTTGA CAGGAGACAA ACAAGAAACT 420 

GCGATTAATA TAGGGTATTC CTGCCGATTG GTATCGCAGA ATATGGCCCT TATCCTATTG 480 

AAGGAGGACT CTTTGGATGC CACAAGGGCA GCCATTACTC A6CACTGCAC TGACCTTGGG 540 

AATTTGCTGG GCAAiGGAAAA TGAC3GTGGCC CTCATCATOG ATGGCCACAC CCTGAAGTAC 600 

GOGCTCTCCT TOGAAGTCOG GAGGW3TTTC CTGGATTTGQ CACTCTCGTG CAAAGCX5GTC 660 

ATATGCTGCA GAGTGTCTCC TCTGCAGAAG TCTGAGATAG TGGATGTGGT GAAGAAGOSG 720 

GTGAAGGCCA TCACCCTCGC CATCGGAGAC GGCGCCAAOS ATGTCGGGAT GATOCAGACA 780 

GCCXAOGTCG GTGTGGGAAT CAGTGGGAAT GAAGGCATGC AGGCCACCAA CAACTC3XSAT 840 

TACGCCATOG CACAGTTTTC CTACTTAGAG AAGCTTCTGT TGGTTCATGG AGCCTGGAGC 900 

TACAACCGGG TGACCAAGTG CATCTTGTAC TGCTTCTATA AGA ACGT GGT CCTGTATATT 960 

ATTGAGCTTT GGTTCGCCTT TGTTAATGGA TTTTCTGGGC AGATTrTATT TGAACGTTGG 1020 

TGCATOGGCC TGTACAATGT GATTTTCACC GC I TTGCOGC CCTTCACTCT GGGAATCTTT 1080 

GAGAGGTCTT GCACTCAGGA GAGCATGCTC AGGTTTOCCC AGCTCTACAA AATCACOCAG 1140 

AATGGCGAAG GCTTCAACAC AAAGGTTTTC TGGGGTCACT GCATCAACGC CTTGGTCCAC 1200 

TCCCTCATCC TCTTCTGGTT TCCCATGAAA GCTCTGGAGC ATGATACTGT GTTTGACAGT 1260 

GGTCATGCTA COGACTATTT ATTTGTTGGA AATATTGTTT ACACATATGT TGTTGTTACT 1320 

GTTTCTCTGA ARGCTGOTTT GGAGAOCACA GCTTGGACTA AATTCAGTCA TCTGGCTGTC 1380 

TGGGGAAGCA TGCTGACCTG GCTGGTGTTT TTTGGCATCT ACTCGACCAT CTGGCCCACC 1440 

ATTCCCATTG CTCCAGATAT GAGAGGACAG GCAACTATGG TCCTGAGCTC CGCACACTTC 1500 

TGGTTGGGAT TATTTCTGGT TCCTACTGCX: TGTTTGATTC AAGATGTGGC ATGGAGAGCA 1560 

GOCAAGCACA CCTGCAAAAA GACATTGCTG GAGGAGGTGC AGGAGCTGGA AACCAAGTCT 1620 

GGAOTCCTGG GAAAAGCGGT GCTGCGGGAT AGCAATGGAA AGAGGCTGAA CGAGCJGCX3AC 1680 

OGOCTGATCA AGAGGCTGGG CCGGAAGACG CX:CCCGAQGC TGTTCCGGGG CAGCTCCCTG 1740 

CAGCAGGGCG TCCCGCATGG GTATGCTTTT TCTCAAGAAG AACACX3GAGC TGTTAGTCAG 1800 

GAAGAAGTCA TCOGTGCTTA TGACACCACC AAAAAGAAAT CCAGGAAGAA ATAAGACATG 1860 

AATTTTCCTG ACTGATCTTA GGAAAGAGAT TCAGTTTGTT GCACCCAGTG TTAACACATC 1920 

TTTGTCAGAG AAGACTGGCG TCCAAGGCCA AAACACCAGG AAACACATTT CTGTGGCCTT 1980 

AGTTAAGCAG TTTGTTAGTT ACATATTCCC TCGCAAACCT GGAGTGCAGA CCACAGGGGA 2040 

AGCTATCTTT GCCCTCCCAA CTCGTCTGCA GTGCTTAGCC TAACTTTTGT TTATGTCGTT 2100 

ATGAAGCATT CAACTGTGCT CTGTGAGGTC TCAAATTAAA AACATTATGT TTCACCAATA 2160 
AGAAAAAAAA AAAAAAA 



Seq ID NO: 222 Protein sequence:
Protein Accession #: NP_057613

1          11         21         31         41         51

MSVIVRTPSG RLRLYCKGAD NVIFERLSKD SKYKEETLCH LEYFATEGLR TLCVAYADLS 60 

EMEYEBWLKV YQEASTILKD RAQRLBECYE IIEKNLLLLG ATAIEDRWA GVPETIATLL 120 

KAEIKIWVLT GDKQETAINI GYSCRLVSQN MALILLKEDS LDATRAAITQ HCTDIX5NLLG 180 

KENDVALIID GHTLKYALSF EVRRSFLDLA LSCKAVICCR VSPLQKSEIV DWKKRVKAI 240 

TLAIGDGAND VGMIQTAHVG VGISGNBGMQ ATNNSDYAIA QFSYLEKIiLL VHGAWSYNRV 300 

TKCILYCFYK NWLYIIELW FAFVNGPSGQ ILFERWCIGL YNVIPTALPP PTLGIPERSC 360 

TQESMLRFPQ LYKITQNGEG FNTKVFWGHC iNALVHSIiIL FWFPMKALEH DTVFDSGHAT 420 

DYLFVGNIVY TYVWTVCLK AGLETTAWTK FSHUVVWGSM LTWLVPFGIY STXWPTIPIA 480 

PDMRGQATMV LSSAHPWLGL FLVPTACLIE DVAWRAAKHT CKKTLLEBVQ ELETKSRVLG 540 

KAVLRDSNGK RLNERDRLIK RLGRKTPPTL FRGSSLQQGV PHGYAFSQEB HGAVSQEEVI 600 
RAYDTTKKKS RKK 



Seq ID NO: 223 DNA sequence
Nucleic Acid Accession #: BC017001
Coding sequence: 1-394

1          11         21         31         41         51

LcGCTOGGC AGGGCGGGCG CXKGTCGQGG GGOSCCCGAG GGGCCCGGGC OGAGOGGCGG 60

OGOSCACGGC GGCAGCATCC ACTCGGGCOG OVTCGCCGOG GTGCACAACG TGCOGCTGAG 120 

OGTOCTCATC CGGCCGCTGC aJiXXJU'lGll GGACCCCGCC AAGGTGCAGA GCCTCCrGGA 180 

CACGATCOGG GAGGACCCAG AGAGarTGCC CCCCATCGAT TCAAAGGGGC 240 

CCAGGGAGGT GACTACTTCT ACTCCTTTGG GGGCTGCCAC CGCTACC3CGG CCTACCAGCA 300 

ACTGCAC3C£5A GAGACCATCC COGCCAAGCT TGTCCSWrTCC ACTCTCTCAG ACCTAAGGCTT 360 

GTACXTOGGA GCATCCACAC CAGACTTGCA GTAGCAGCCT CCTTGGCACC TGCTGCCACC 420 

TTCAAGAGCC CAGAAGACAC ACCTGGCCTC CAGCAGGCTG GGCCATGCAG AAGGGATAGC 480 

AGGGGTGCAT TCTCTTTGCA CCTGGCGftCA GGGTCTGACT CTGGGCACXX: CTCTCACC3GG 540

CTACAAGGCC TTGGACTCAC TGTACAGTGT GGGAGCCCCA GTrCCCACXTT CT6TGACAAT 600 

AGGATCATGG CCTTACCCTT GAAGCATTAC CGAGAAGGAG AACAGAGATG GGCTT^AGA 660 

GCCACGTCCT GCCGGCTCCA AATTCCCAAG GACAAGGATC CCTCTGCATT TTTGTCTATG 720 

SaOCTCTTA TATGGACTAC ATTCAGCTCC AAGGAAAGGA AAACCTTGAT TGCAGTGGTT 780 

TAAACAAACA GAAGATTGTT TTTCCACATA GCATGGATTC TGGAGATGGG TGC3CTAATGG 840 

SaActS GGAGGTAGGG GTCACffrCTT GGATCCTTTT GCCTTAATCT 900 

ScHgLvG GAax;GT<^ GGAAGAGGAA GGGC^ 1020 

ATCAGAAAGT AAAACCTCGT CAGAAGTCTC TTTCCTGCTC TCTCCCTCTG CATATCTTCA 1080 

m^MGCC CTTCGCCCGA GCCAGCTACC ATTGCACCTC TAGCTCCAAA CAAAGCTAAG 1140 
S^S^ SJ^A^iS CATGGCTGA^ 

SgSc^ CTOrroCAAT AAAACTGGGA TCC^ 1260

S^CCAGTT AGCTTTTGCT GTCTAACAAA ^ 1320
SctS^ TCCCACAATC CTATGGGTTC 

CTGCIGCTCA CACCTCGGAT CCCTCATQGA GCTAAGGTCA GCTGTTACCT CAGCTGGGCC 1440 

SSS^ TACTCACTTG CCTGGCAGGT GACAGGC^ 1500 

SCTTOGTTC TCCTCCATGT GGCCTCTCX^A GCAGGCTAGC TCAGGCTTAT TCACATGATG 1560 

GCrrCAGGAT TCCAAA6RGA GTGAGAGTAG AAGCTGAAAG ACTTCTTGAG TTCTTGGCCT 1620 

^SIcTC^ CT^A^ GTCACTTCTG CTAAGTTCTT 1680

^S^S^ SSS^ GATGAGAAAC AGACTACAT^S TCT^^ 1740 

AAGAGCTTGT GGCCATTTTT CACCTATCAC AAATAATTTT GGATGGGTAT TTATTT^ 1800 

AAAGGTATTT CCCTCTTCCC CCTTrCTCTC TGTCTCATGG GGCCTCACTC TGCCAAGTTG 1860 

GAAGGCACTA AGACATTGTC CT6GCCCTCA GGGTCTAGGG GAAGAGGTGT TGGGGCAGGA 1920 

S^Sc QGACCCACTG TAGTAGGAGT GCCTCC^ 1980

GOT^GGGT TAGGCCAGGT AGGACATTCC AGAGGGGCTT CTGAAAACCA AGAGTCCCTG 2040 

™G^^ GCAGGCCTTG TTCTCACTOC CCrCTAAGGG 2100 

CTCGGCACTT TTAAGCX:rCA GTTTCTCCAG TTCAATAATA AGGACAAGAO CTTTTCCCAT 2160 

GCATTCTCTT TCCCCGGGAA AGTTGACTGA GGTGACCAGT AATAGAATTG AAAAGGGAGA 2220 

CTCTCTTCAG TGCAATGTGG CATCCTGGAT TGGGTCTTGG AACAAAAACA GGACATTAGT 2280 

GGGAAAATTG GAAATCTGAA AAAAGTCTGA ATrTTAGTTA ATATACCAAT TTCAGTCTCT. 2340 

TGGTTTTGAC AGATGTACCA TGGTGATGTA AGATGTTGAC CTTGGGGTAG GCTGGGT6AA 2400 

GGGTATACavG GAACTCrTTG TACTATCECT GCAACTTCTC TGTAAATCTA GTATCATTCC 2460 
AAAATAAAAG TTTATTTAAT TTAAAAAAAA AAAAAAAAAA AA 



Seq ID NO: 224 Protein sequence:
Protein Accession #: AAH17001.1

1          11         21         31         41         51

TLGRAGAGRG APEGPGPSGG AQGGSIHSGR lAAVHNVPLS VLIRPLPSVL DPAKVQSLVD 60 
TIREDPDSVP PIDVLWIKGA QGGDYFYSFG GCHRYAASTOQ LQRBTIPAKL VQSTLSDLRV 120 
YLGASTPDW} 

Seq ID NO: 225 DNA sequence
Nucleic Acid Accession #: NM_021048
Coding sequence: 3... 1110

1          11         21         31         41         51

ATGCCTCGAG CTCCAAAGCG TCAGCGCTGC ATGCCTGAAG AAGATCTTCA ATCCCAAAGT 60
GAGACACAGG GCCTCGAGGG TGCACAGGCT CCCCTGGCT6 TGGAGGAfiQA TGCTTCATCA 120

TCCACTTCCA CCAGCTCCTC TTTTCCATCC TCTTTTCCCT CCTCCTCCTC TTCCTCCTCC 180 

TCCTCCTGCT ATCCTCTAAT ACCAAGCACC CCAGAGGAGG TTTCTGCTGA TGATGAGACA 240 

CCAAATCCTC CCCAGAGTGC TCAGATAGCC TGCTCCTCCC CCTCGGTCGT TGCTTCCCTT 300 

CCATTAGATC AATCTGATGA 66GCTCCAGC AGCCAAAAGG AGGA6AGTCC AAGCACCCTA 360 

CAGGTCCroC CAGACAGTGA GTCTTTACCC AGAAGTGA6A TAGATGAAAA GGTGACTGAT 420 

TTGGTGCAOT TTCTGCTCTT CAAGTATCAA ATGAAGGA6C CGATCACAAA GGCAGAAATA 480 

CTGGAGAGTG TCATAAAAAA TTATGAAGAC CACTTCCCTT TGTTGTTTAG TGAAGCCTCC 540 

GAGTGCATGC TGCTCGTCTT TGGCATTGAT GTAAAGGAAG TGGATCCCAC TGGCXaCTCC 600 

TTTGTCCTTG TCACCTCCCT GGGOCTCACC TATGATGGGA TGCTGAGTGA TGTCCAGAGC 660 

ATGCCCAAGA CTGGCATTCT CATACTTATC CTAAGCATAA TCTTCATAGA GGGCTACTGC 720 

ACCCCTGAGG AQGTCATCTG GGAAGCACTG AATATGATGG GGCTGTATGA TGGGATGGAG 780 

CACCTCATTT ATGGGGAGCC CAGGAAGCTG CTCACCCAAG ATrGGGTGCA GGAAAACXAC 840 

CTCGAGTACC GGCAGGTGCC TGGCAGTGAT CCIGCACGGT ATGAGTTTCT OTGGGGTCCA 900

AGGGCTCATG CTGAAATTAG GAAGATGAGT CTCCTGAAAT TTTTGGCCAA G6TAAATGGG 960 

AGTGATCCAA GATCCTTCCC ACTGTGGTAT GAGGAGGCTT TGAAAGAT6A GGAAGAGAGA 1020 

GCCCAGGACA GAATTGCCAC CACAGATGAT ACTACTGCCA TGGCCAGTGC AAfiTTCTAGC 1080 
GCTACAGGTA GCTTCTCCTA CCCTGAATAA 

Seq ID NO: 226 Protein sequence:
Protein Accession #: NP_066386

1          11         21         31         41         51

MPHAPKRQRC MPBEDLQSQS ETQGMX5AQA PIAVEEDASS STSTSSSFPS SPPSSSSSSS 60
SSCYPLIPST PBEVSftDDBT PNPPQSAQIA CSSPSWASL PIi)QSDBCSS SQKEBSPSTL 120
QVLPOSBSLP RSBIOeiCVTD bVQFLLFKXQ MKBPITKABI LESVIKHYB} HFPLLPSEAS 180
EOOiVFeiO VKEVDPTCaiS FVLVTSUaiT YDOOiSDVQS MPKIGUiILI LSIIFIEGYC 240
TPEEVmSAL VrntSLYOaSB HLIY6EBBKL I.TQI>WVQEIiy I^yBOVPGSD PASYEFLKGP 300
SABAEnUOIS LUCFLAXVHG SDraSPPUiy EBALKDEEBR KglKUTTBO TTAMASASSS 360
ATGSFSyPB

Seq ID NO: 227 DNA sequence
Nucleic Acid Accession #: NM_005025.1
Coding sequence: 82-1314

1          11         21         31         41         51
|          |          |          |          |          |
GOGGAGCACA GTCCGOOGAG CACAAGCTCC ACCATCCCGT CAGGGGTTGC AGGTGTGTGG 60
GAGGCTTGAA ACTGTTACAA TATGGCTTTC CTTGGACTCT TCTCTTTGCT GGTTCTGCAA 120
AGTATGGCTA CAGOGGCCAC TTTCGCTGAG GAAGCCATTG CT6ACTTGTC AGTGAATATG 180
TATAATCGTC TTAGAGGCAC TGGT6AA6AT GAAAATATTC TCTTCTCTCC ATTGAGTATT 240
GCTCTTGCAA TGGGAATGAT GGAACTTGGG GCCCAAGGAT CTACCCAGAA AGAAATCCX3C 300
CACTCAATGG GATATGACAG CCTAAAAAAT GGTGAAGAAT TTTCTTTCTT GAAGGAGTTT 360
TCAAACATGG TAACTGCTAA AGAGAGCCAA TATGTGATGA AAATTGCCAA TTCCTTGTTT 420
GT6CAAAATG GATTTCATGT CAATGAOGAG TTTTTGCAAA TGATGAAAAA ATATTTTAAT 480
GCAGCAGTAA ATCATGTGGA CTTCAGTCAA AATGTAGCCX3 TGGCCAACTA CATCAATAAG 540
TGGGTGGAGA ATAACACAAA CAATCTGGTG AAAGATTTGG TATCCCCAAG GGATTTTGAT 600
GCTGCCACTT ATCTGGCCCT CATTAATGCr GTCIATTTCA AGGGGAACTG GAAGTCXK»G 660
TTTAGGCCTG AAAATACTAG AACCTTTTCT TTCACTAAAG ATGATGAAAG TGAAGTCCAA 720
ATTCCAATGA TGTATCAGCA AGGAGAATTT TATTATGGGG AATTTAGTGA TGGCTCCAAT 780
GAAGCTGGTG GTATCTACCA AGTCCTAGAA ATACCATATG AAGGAGATGA AATAAGCATG 840
ATGCTGGTGC TGTCCA6ACA GGAAGTTOCT CTTGCTACTC TGGAGCCATT AGTCAAAGCA 900
CAGCTGGTTG AAGAAIGGGC AAACTCTGT6 AAGAAGCAAA AAGTAGAAGT ATACCTGCCC 960
AGGTTCACAG TGGAACAGGA AATTGATTTA AAAGATGTTT TGAAGGCTCT TGGAATAACT 1020
GAAATTTTCA TCAAAGATGC AAATTTGACA GGCCTCTCTG ATAATAAGGA GATTTTTCTT 1080
TCCAAAGCAA TTCACAAGTC CTTCCTAGAG GTTAATGAAG AAGGCTCAGA AGCTGCTGCT 1140
GTCTCAGGAA TGATTGCAAT TAGTAGGAIG GCTGTGCTGT ATCCTCAAGT TATTGTCGAC 1200
CATCCATTTT TCTTTCTTAT CAGAAACAGG AGAACTGGTA CAATTCTATT CATGGGAOGA 1260
GTCATGCATC CTGAAACAAT GAACACAAGT GGACATGATT TCGAAGAACT TTAAGTTACT 1320
TTATTT6AAT AACAAGGAAA ACA6TAACTA AGCACATTAT GTTTGCAACT GGTATATATT 1380
TAGGATTTGT GTTTTACAGT ATATCTTAAG ATAATATTTA AAATAGTTCC AGATAAAAAC 1440
AATATATGTA AATTATAAGT AACTTGTCAA GGAATGTTAT CAGTATTAAG CTAATGGTCC 1500
TGTTATGTCA TTGTGTTTGT GTGCTGTTGT TTAAAATAAA AGTAOCTATT 6AACATGTG



Seq ID NO: 228 Protein sequence:
Protein Accession #: NP_005016.1

1          11         21         31         41         51
|          |          |          |          |          |
MAFIXa^FSX^L VLQSMATGAT PPEEAIADLS VNMYNRIiRAT GEDEMILFSP LSIALAMGMH 60
EIiGAQGSTQK BIRHSMGYDS LKMGEEFSFL KEFSNMVTAK ESQYVMKIAN SLFVQNGFHV 120
NEEPlQirnXK YPNAAVNHVD FSQNVAVANY INKWVENNTN NLVKDLVSPR DFDAATYLAL 180
IMAVYFKGNW KSQPRPENTR TFSFTKDDES EVQIPWKQQ GEFYYGEFSD GSHJEAGGIYQ 240
VLEIPYEGDE ISMMLVLSRQ SVPZATIiEPL VKAQZjVEEWA MSVKKQKVEV YLPRFTVBQE 300
IDLKDVLKAL GITEIPIKDA NXiTGLSDNKB ZPLSXAZHKS FLEVNEEGSE AAAVSGMIAZ 360
SRMAVLYPQV IVDHPFFET.I RNRRTGTILP MGRVMHPETM NTSGHDFBEL


Seq ID NO: 229 DNA sequence
Nucleic Acid Accession #: NM_003695
Coding sequence: 12-398

1          11         21         31         41         51
|          |          |          |          |          |
OSACATCAGA 6ATGAGGACA GCATTGCTGC TCCTTGCAGC CCTGGCTGTG GCTACAGGGC 60
CAGCCCTTAC CCTGC6CTGC CA06TGTGCA CCAGCTCCAG CAACTGCAAG CATTCTGTGG 120
TCTGCCCGGC CAGCTCTCGC TTCTGCAAGA CCACGAACAC AGTG6AGCCT CTGAGGGGGA 180
ATCTGGTGAA GAAGGACTGT GCGGAGTCGT GCACACCCAG CTACACCCTG CAAGGCCAGG 240
TCAGCAGCX3G CACCAGCTCC ACCCAGTGCT GCCAGGAGGA CCTGTGCAAT GAGAAGCTGC 300
ACAACGCTGC ACCCACCOGC ACCGCCCTCG CCCACAGTGC CCrCAGCCTG GGGCTGGCCC 360
TGAGCCTCCT GGCCGTCATC TTAGCCCCCA GCCTGTGACC TTCXXCCCAG GGAAGGCCCC 420
TCATGCCTTT CCTTCCCTTT CTCTGGGGAT TCCACACCTC TCTTCCCCAG CCGGCAAOGG 480
GGGTGCCAGG AGCCCGAGGC TGAGGGCTTC OCCGAAAGTC TGGGACCAGG TCCAGGTGGG 540
CATGGAATGC TGATGACTTG GAGCAGGOCC CACAGACCCC ACAGAGGATG AAGCCACCCC 600
ACAGAGGATG CAGCCCCCAG CTGCATGGAA GGTGGAGGAC AGAAGCCCT6 TGGATCCCCG 660
GATTTCACAC TCCTTCTGTT TTGTTGCCGT TTATTTTGTA CTCAAATCTC TACATG6AGA 720
TAAATGATTT AAACC



Seq ID NO: 230 Protein sequence:
Protein Accession #: NP_003686

1          11         21         31         41         51
|          |          |          |          |          |
MRTALLLLAA LAVATGPALT LRCHVCTSSS NCKHSWCPA SSRFCKTTNT VEPLRGNLVK 60
KDCAESCTPS YTLQGQVSSG TSSTQCCQED LCNEKIiHNAA PTRTALAHSA LSLGLALSLL 120
AVILAPSL






Seq ID NO: 231 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 126-752

1          11         21         31         41         51
|          |          |          |          |          |

CCX30GC3\OGT GGCTC»TGCT OGG6AGCGIO GTKSVGOGGC TGGOGOGGTT GTCCTGGAGC 60 

AGGGGC3GCAG GAATTCTGAT GTCAAACTAA CAGTCIGTGA GCCCTGGAAC CTCCACTCAG 120 

AGAAGAT6AA GGATATCGAC ATAGGAAAAG AGTATATCAT CCCCAGTOCT GGGTATAGftA 180

GTGTGAGGGA GAGAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAAGTTCA 240 

GGAGAACTOG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCOGAGGGCC 300 

TCrCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 360 

GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCOG GACTACTTCC AAACACCAGC 420 

ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT TTOGTGGCTT TCTTCTCTGG 480 

CCCGTCTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA OGTGTGGTCT CTGTCCAAGC 540 

AC3GAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTTGGGCC AGACGCTGCT TCCCTGCX3AA GGGTTGTGTG GATCTTCTGC OGCACCAGGC 660 

TCATCCTGTC CATOGTGTGC CTGATGATCA CGCAGCTGGC TGGCTTCAGT GGACCAAATT 720 

TTCAGGA'TCG CTGTATTCTG CX3GTCAGAAT GAGAGAGTCA AGCTGGGCAG AATCTCTOSC 780 

CAAGAGTTCA GCCTTCCTTT GGAGACTGCT CCATCAGTGC CGAGGTCTGT GGGAACAGGC 840 

TTCACTGCAC OGCCATCTTA CTGACTTGCT TCACGTGAGG AAAAGGGGGC TTTGGCCCTG 900 

TGACTCAGTT CCACATTTTG GATTGCATAC TGGAAAAGAA GCCAATCTTC TTGCTAGTAA 960 

ACCAGCAACC CGGCTGTATA CAGTGGTGAC CCAAGCAATG GATATAAACC TAAAAATCTG 1020 

AGGGAGGGGA GAGGTGGAAT ACAGrTAGTTC TTGGAATCTG AAGTCTCCTA rTTGATCAGG 1080 

TTATTTCCTG GGACTTGGCA AAAATCTGAT TGGTGGGGAT CTCCTAGGAC CTAGTGGACA 1140 

TCTOGTATTA ATTTAATCTC AiGGAAAAACA AGAAATTAAC CCAGAGAGAG TCTGGCTTTT 1200 

GGAATTCAGC GTAGCTACCT CCAGACCGTG GTGTCTGGCC TCCATTTTTG TCTGTCATTC 1260 

AGCTCTGACT TACAGCTGCA GTCACCTTTG CTATAAGGCA CCTGGGTAGA AGGGTGGATG 1320

GGCTTCACAT CAATTTTTTT CTTCCTTTAG GCTG6GGGAT TGGTTTGGCT TTCTTTTGTT 1380 

Q^-j-j-j.^ GTTTTATTTT TGTCAAGATT GATTTTTAGA TGCAAGGACT TGAAAAGACC 1440 

CAGAAGGATC CCACCAGTTT TTCCTTGAGG CCTAGGATTT TTTATTCTGT CCCGAGCAGA 1500 

GGTAATTCCT CACAACTTAG TGCACCAGTA GCACCAGCCA TTTTGAGCAG AGTACCTCTT 1560 

TGGGGftGCTT TrC G ri'TTGT TTTGTTTTTA ATTCTCTTTC CTTAGCAGCA AGGTCTTTTT 1620 

TCCTAGAGAA TCTACTCC3GT TGCA6AATCA TTGCAACXTTC AGGAGCCCTC ACTGATTGAG 1680 

TGCTGTCAGC CTGATATACT ACTTTGGACT CTGGAAACAG ATATGGGTTC TATTCTCTAT 1740 

TTCTACTGTG TCTCGTTAAA CAACCGTCGG AGACCAGATG ACCTGTTAGA TGGCTAGTCC 1800 

TGTATAACTC GACTCTGTAT GTTTCAAIGT ATGTTACTGC AATGCTTCAC CTGCTGTACA 1860 
GTGrrtGTGA GATGCTCTTT GAAGATGGTA CTTTTATATT T 

Seq ID NO: 232 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51 

t i i i I i 

MKDIDIGKEY IIPSPGVRSV RBRTSTSGTH RDREDSKFRR TRPLECQDAL ETAARAEGLS 60 

LDASMHSQLR ILDEEHPKGK YHHGLSAI*KP IRTTSKHQHP VDNAGLFSCM TFSWLSSLAR 120 

VAHKKGEliSM EDVWSLSKKE SSDVNCRRLE RLWQEELNEV GPEtAASLRRV VMIFCRTRLI 180 
LSIVCmiTQ LAGFSGPNFQ DGCILRSE 

Seq ID NO: 233 DNA sequence 

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

I I I } 1 I 

TTTTAATGGT GCTCATATAT ACTGTATTTT TTGTTGTTTA GTrTTACTTA TTGAGAGTGT 60 

CACAACATGA ATCACATAAT CATGATTTTT TTTTTTTACT TTTACTCCCC AAATTATTCA 120 

TCTTTCTTAG ATCGTAGTCA TTGAGAAGTC CCAATAACTC TAAACTTTTG AGTTATAACG 180 

TAGTAAACTT CTCTTTCATC TTTGTGTTAG CTCTGTAGTC TTAACCTGGA TTTTAATTTT 240 

TTTGTTTCCA AAGTCACAAT TGAATTATTC TTAGATAOCT TAAGC3CACTG AATTCAGTTC 300 

TGTTTGACTG AAAGCAAAAC AACGTGACAG TTTATTTTCA AACACTAACT TCTTGATATT 360 

TTGTTATGGT ATATCTTTTT ATTAAATATT TATTTTGACT AAGCTTTCAT AAAATATTTG 420 

AAGCTATTTT AATCATCAAG TATGGAAAAC AAATTACTAT TGCATTTTCC TATATATGCA 480 

TATATTATGG ATTAACCAGA ATTGTATCAT TTTTGGCCTA ATGTCTGGAT ATAAAAGATA 540 

ATTAGCCTAC TATAGTATTA ATAAATTTTT CAGTTGGTTT GGGCAAATTT AAACCTGAAA 600

AATAGGTTAA AAAGTAGTTA CAAATTAAAC TTACTAATTT ATACCTGATT TTTTTTCTTG 660 

AATTAAAGTA CATTTTAAAT GAGCTTTATA ATACCTTAAA AAGTTGGTTC TAATTTAAAA 720 

TATGAAAGCT CTQGCTATCA TCCTGGGATA GTAATTTCTA ATTATATAGT ATTTCAAAAC 780 

TATATATTTT TTAGTTCCTT TGAGATAACT AATTTCTAAT TATATATGTT TCAAAAACCA 840 

TATCCTGTAT TTTTTTTAAG AATTGTTTTA TAAATAGGTC ATAA6ATACA AGGTCTGCAT 900 

TAGAAGACCC ACTCTTACTA GGTTCCCTAA GGATCTGCCA TAGATTTTTT TTTTTTTTTT 960 

TTTTTTTTA6 GTAGTTTAAA GCAAGCACTG ATACCAGTGG GAGTTGGTCT TGATCTAGGA 1020 

GATTCTGTTA AGCATCCAAA AACAATGCCT AATTTCAGTT CTTAG6TTAT GGCTTGTGAC 1080 

TCCAGATAAA AGATGGAGAA TACCTCATGT ACTGTGACTT GAAAATGAAT TCTTAAAATT 1140 

CTTAGGCTCT CTCCATGTAT CTTTCTTAAG GAAAAGTTTC TGAGTGTGAT CTCTCTTTTG 1200 

CCATACTATC AAGTGGAGGG TAGTTCAGAA AAGTTAATAG GAAATCTTTT GTGACAGCAG 1260 

ACTATAATAG AAGTTTGAGT AATATTTTAA TAAATTTATA TAATTCAAAT GATAAAAATG 1320

TATCAATGTT ATCCAATGAT TTTTATTAAA AAATTACCTT ATTATTAGAA CTGTGCCTAT 1380 

TACATAAAAA GTGCTCATGT ATTTGAATTT TAAATAATTT ATTTAAATCA AGACCACCAT 1440 

AAGTCATTAA TAATTTAATA ATTGTTTTAA ATCAGTGGTT TTCAACCCTC ACTTCATATT 1500

AGAATCATCT GAQGACTTTT AATATGGAAT CCAGCTCATA ACAATTAAGT CTAAATTTCT 1560 

GGAAGATGGA GCCATGCTTG TTTTTCCAAA AGCTCTTTGA GTGATTCTAA TTT6TAGTCA 1620 

GAGTTGAAGA CCACTGCTCT AAATTAGTGC AGGAAAATGC TTTTATTTCT CCCATGTTAA 1680 

CTTTTAAAAC TAGTAATGTA CCCAGTTAAG TTTTGATGGT TTAAATTCCA CTAAAGAACA 1740 

TATTCTTCTA ATAACTAGCA TTTATTACAT GAAATTTAAG AGTTTAAGTT CCATCAAACT 1800 

AGCCCTTGTG TAAGATTATT ATTTCTTCTC TATAACTTCA AAATAGATAT TTCATTCAAA 1860 

CTGTTCAGGT GAGAAAACAT AATGGATTTT TTTTTTTTTC CTCTGGAGCT GCCTGTTCAG 1920

TGAGATGGAG GAGGTGGGCA CATTTAAGGT CAGTTCACTA ACCTATGGTT CAGAGTTCTG 1980 

ATCATATGGA AGTTTGGAAA AGAGAGCTTA TCACAGGTTT GTATGCTGGT GAAT GQATftG 2040 

TTTTAATTCT CACTGTCTCA AAAGAGAATC AGCTCTCCAG CAGTTCTAGA AAAGCTTTGA 2100 

CAATCCCCAA GGGGCAGTGT TACCTTACTC CTTCACTGCT TCTTAGAAGG TAGAATTAAG 2160

TTTCTGGAAT TCCACCTACA TGTTTTCTTA TTAACATTCA GAATTGGGAA TATTAATTTT 2220 
TCCACTGAGT AGTTTTCTGA AATTGGTAAC TTGGAGAGTA
CAATTTTGTG TTTC?TTTACT TTT ATGTAAA AATTTGATAT 
AAAACCTCAT GCCrTTTCAT TACATCTAAT TTGAACTCTC 
TTTAAAGATG CTTTAATGAA AATSTATTAAG AAAATATA TA 
CTTCAGAAAT CCATATATTT GTCATATTTA Ulil llAGA 
AGATGGTATT TAAAATGAAT GCOCAAAAAT ATCITCTACC 
TTGGAAGCCG CCAGCCATTC ATGTAGAGAG TTTATAAGAA 
ATTTTATATT ACTATGGTAT CTGTGTACCA TATTTCTAAG
CTTCTTAAAA CCATAACCTC GCTTGCCTTT TAGTGTTAAA 
ATAGAGATTC TTCTTTTATG AAGAAGAGCT GACGTAATIT 
AAGACATTAA CATAAGTCTC TGAGCAGTGA TACATTTTCA
CCACATTAAA CAACXAOGGC AACACTCAGA CTTGGCACTT 
TGTGCCTGGT ATCGCCTCTG GCATAACTTA CAOSAA TOGT 
CCTTCATCAA GCACTTGCCA ACACATTCAC CTCTAACTTG 
ACAACATCTG CAACTCTACC CTATCAACTG CCAACCTAAA 
COOCAAACAC AAAACCACTA AATCATAACC ACCACACACG 
CACACAACCA ACACACCAOG ACCAAACACC CCACCACAAA 
GACAACACAT CACATACACT CACTACCCOC CCATACTCCC 

Seq ID NO: 234 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 27-281 






AAATAAGGTA 
GTGAATTACA 
AACTTCAGTG 
GATTTGTATG 
AACCICCTAA 
TTTGTCCAAA 
AATAATTTAA 
TATTCATTAT 
CACAAAATOC 
ATTACCAGTG 
AACATGAAGA 
TCCTAOGAAT 
CCrOOCTACT 
TACAACCTTA 
GAOCCCCAAC 
CCACACACCA 
CAAGCTAACA 
ACCCACCA - 



TTTTGcrrrT 

CAGTTCTAAT 
CCAGAAGTGC 
TCAGTTTATA 
TTGGATAACT 
AGTTTATCTG 
AATTOIATGC 
TAAATT6GTA 
AACATTGTAT 
CATCTGCACA 
GTGACAACCA 
CCATOCTATA 
TGTCTACGCT 
OCAACTCACC 
ACAACACAAC 
CACACCCACC 
ACCACAAACA 



2280 
2340 
2400 
2460 
2520
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 



1 
1 

AGCAGGAGGA 
GTCTACCGCG 
TTCCTGCCCC 
GGCGCTCACT 
CAGGGTTCCG 
GAGTTTTTCT 
CAGAAAGAAT 
ATGTTGCAGG 
TTTTCTTCTC 
ATAAAAACTG 
GAXGCCAGGA 
TCAAGCCAAG 
CTCAAACCCG 
GGTCAGAACA 
AGACAGCCTG 
TCTCAGTATC 
CATAAACACA 
G6AAAGGTCT 
TCTTTTGATG 
GCAAAGAAAA 
TTTATTTTTA 
TTTGACCTTG 
QATTAAAACA 
AAATCTCAAG 
AAAAAAAAGC 
GCACTTCTCT 
TCCCATCAAA 
ATAATCCC 



11 

1 

GAGCTGGC3GG 
TAGCAGTTAC 
GTGCTCATTT 
GAAACAGTGT 
ACCAATCCAA 
TTGCTCTGAT 
GCAAGGAGAT 
AAGAGCTAGT 
CACTTGGCAT 
TTCAGCGGTT 
GGAAAGATGC 
CCAACAGGTG 
GGGAAGCCCA 
CAGCTAAGCA 
TGACGTTTCA 
TTACGCCCAG 
TAACAGCAGC 
CCTGTGACTG 
ACTCTATATC 
ATAAAAGACA 
AACTGAATTC 
AAATAATCTT 
TATTAGTAAT 
GCTTTTAAAG 
CCTCCATCTG 
TCTCATTTTC 
GCCAAAGAAA 



21 
1 

GAAGACATGC 
ATCAGACTGA 
GGGGCTGACG 
GTTGCTCCAC 
GAGCCTTGCA 
CTTG6AGACA 
AGACCAAOGT 
CTTTCAGGCT 
ATCAAGAGCC 
CGCCAACAAG 
CAGGGGTAAA 
TTCTGTTTTT 
CTCTAGAACX: 
GATGGCTTGG 
AAAGCAAAAG 
TGACACGATC 
AGCAATAATT
TTTTATTTTT 
CAACTCTGAG 
ATTTCCAGTA 
AGCAGAGATT 
TACATTGTAA 
TAATTATTAA 
CATTTGTACA 
ATTCTCATTT 
CACTGTCTCG 
GAAAAGAAAA 



31 

I 

ACOCCTTGAA 
GACACTTCCr 
CCATTTTAGG 
ACOGCCTTGT 
GAAAGCATTA 
TCCCTCTGCC 
GAGATTCTCC 
GGGCTGGTGA 
AGGOGTGGAA 
AAGTGGTAAA 
GTGGGAAAAT 
CATCACAGAA 
CATGCTGGTC 
GTCATCAGGA 
TCCCCTACCA 
TACCCTCAAA 
AAAGATGAGA 
AGGGAAACAG 
GTTTGATTAA 
AGTATCCCAG 
TACATGCATT 
ATTCTTAATG 
AGGAGAATAA 
AATGACTGGA 
TCATTGTCAG 
CAAGCTAGAA 
TTGTTCTGTA



41 

1 

GACGCAGAGA 
GTTTACAGGA 

CCTCAGOCCA 
TTTGCTTGTT 
ACGTGCTTTT 
TAGTGGAAAC 
TTCATGCACT 
CCTGAGAAAG 
GACTAAAACA 
GTAGCAAAAA 
GGGAACCTGA 
CTAATAAGTG 
ATCCATATCC 
CGTCCATTAC 
GCCAGTGAAG 
ACTTAAAAAA 
TGAGAACAAT 
AGAGGAAGAA 
A6AAATGACC 
TTCGAATTAA 
ACGATGATTA 
ATCAAAACAA 
TTGCAAATAC 
CATTTTTTAA 
TGCAACAACA 
ATTCrCAOSA 
CAGATATATG 



51 
i 

GAGGC06TCT 
GACTATAAAA 

TCTGCACCCA 
GGCGCGCTCT 
CTCTTTGGCA 
ATAAGGAATA 
CAAGAGAAAG 
AATGTCCAQC 
GGAAA1GTTT 
TGG6GATGGA 
AGCCAGGAGG 
GTGCTGAGGA 
CCAAGGCCCT 
ATGCAAAGGA 
CTACCTGATT 
AAAAGGGAAA 
TAAGAAAAAA 
GAATGATTTT 
TTGAACCACA 
TGATTTACTT 
ACATCTGAAA 
G6TTCTCAGT 
AACATTCCTA 
ATTTGAAAAA 
AAAAAGGTAT 
CTACCTTTGA 
ACATTAAAAA 



Seq ID NO: 235 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 



41 



51 



LpLKTOREA VCLPRSSYIR iRHFLPTGDY KIPAPCSFGA DAILGI*SPSA PRRSLKQCVA 
KQUiVLLVGA LSGFRPIQEP CRKH 

Seq ID NO: 236 DNA sequence 
Nucleic Acid Accession #: NM_002075
Coding sequence: 406..1428




GACTGCATGA 
GCCAGTGCCA 



11 

1 

GGCAGACCTG 
ACCCAGAGGC 
AATCTCAGCT 
GCTGAGGAGC 
GGCCAGCTCC 
GAGGGAGTAA 
CCCAAGAGCC 
AGGAAGGGGA 
CTCTGGCAGA 
GGACGTTAAG 
TGCPGGTAAG 
ACGTGCACGC 
G6AACTTTGT 
GTGAGGGCAA 
GCCGCTTCCT 
GGGACATTGA 
GCCTGGCTGT 
AGCTCTGGGA 



21 
I 

TCCATCCTTC 
AGCTGGTT6G 
CCTGCCTGTA 
ACAGTTTGAG 
TCTGGCAGCA 
GGAGGCTCCC 
AGAGTGACCC 
GCAGCTCAAG 
GCTGGTGTCT 
GGGACACCTG 
TGCCTCGCAA 
CATCCCACTG 
GGCATGTGOG 
TGTCAAGGTC 
GGATGACAAC 
GACTGGGCAG 
GTCTCCTGAC 
TGTGCGAGAG 



31 
I 

TCTGTGGGTC 
GGTTTGTGGA 
CCCTCCCATA 
GCCCCXZCCAA 
GAGCCTGGGC 
AGGAACCGGA 
CTCGACCTGT 
AAGCAGATTG 
GGOCTAGAGG 
6CCAAGATTT 
GATGGGAAGC 
CGCTCCTCCT 
GGGCTGGACA 
AGCCGGGAGC 
AATATTGTGA 
CAGAAGACTG 
TTCAATCTCT 
GGGACCTGCC 



41 

I 

CCCTGTACCT 
GAAGAAGGAT 
CTCACCAAAC 
CCCCCCGC06 
AGGTGACGGG 
GCPGGAAACC 
CAGCCATGGG 
CAGATGCCAG 
7GGTGGGA06 
ACGCCATGCA 
TGATCGTGTG 
GGGTCATGAC 
ACATGTGTTC 
TTTCTGCTCA 
CCAGCTCGGG
TATTTGTGGG 
TCATTTCGGG 
GTCAGACTTT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 



51

I 

TTCTCCCCCA 
TATCCAGATC 
CCTCTTCCCC 
GTOGGGGCCA 
CGGGCGCGGG 
CGGCCGACGT 
GGAGATGGAG 
OAAAGCCTGT 
AGTCCAGATG 
CTGGGCCACT 
GGACAGCTAC 
CTGTGCCTAT 
CATCTACAAC 
CACAGGTTAT 
GGACACCACG 
ACACA0G6GT 
GGOCTGTGAT 
CACTGGCCAC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080



GAGTOGGACA

GATGACXKrrr 

CACGAGAGOV 
TTOGCTCGCT 
6GCATCCTCT 
GCTGTGGCCA 
AAGGGAAGTG 
GGTGTTCTCT 
GGGAGCATGG 
CXaTCTOCTC 
GGACAACCTG 
GCCCTAGGAT 
TTGGCXXrroT 
TTCTCCTTTT 
GT 



TCAACGCCAT 
CCTGOOGCTT 
TCATCTGOGG 
AOGAOGACTT 
CTGGCCAGGA 
CAGGTTCCTG 
GAAGGCAGTG 
TCTATATTCC 
GACTGTGCCT 
CCATGGCCTT 
CCCCTOCCCA 
TCCTCCCCCA 
GACTATGGCT 
TCTACCTTTT 



CTGTTTCTTC 
GTTTQACCTG 
CATCACGTCC 
C3UICTGCAAT 
TAACAGGGTG 
GGACAGCTTC 
AACACACTCA 
GGGTGCCATT 
TTQGGAGGCA 

cccrccccAC 

GCOCTTTGCA 
GAGGCACTAC 
CTGGCACCAC 
TTTCTCTCCT 



(XCAATGGAG 
OGOGCAGACC 
GTGGCCTTCT 
GTCTGGGACr 
AGCTGCCIGG 
CTCAAAATCT 

ccAGcccxrr 

CCCACTAAGC 
GCATCAGGGA 
AGTCCTCACA 
GGCCCAGCAG 
CTTTGTCCAC 
TAGGGTGCrO 
AAGAGAOCTG 



AGGCCATCTG 
AGGAGCTGAT 
CCCTCAGTGG 
CCATGAAGTC 
GAGTCACAGC 
GGAACTGAGG 
GCCOGAOCCC 
TTTCTCCTTT 
CACAGGGGCA 
GCCTCTCCCT 
ACTTGAGTCT 
GCCTGGGTGG
GCCCTCTTCT 
CAATAAAGTG 



CACGGGCTCG 
CTGCTTCTCC 
CCGOCTACTA 
TGAGOGTGTG 
TGAOGOGATG 
AOGCTGGAGA 
ATCTCATTCA 
GAGGGCACTG 
AAGAACTGCC 
TAATGAGCAA 
GAGGCCCCAG 
TATAGGGOGT 
TATTCATGCT 
TAGCACCCZG 



1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 



Seq ID NO: 237 Protein sequence: 
Protein Accession #: NP_002066

1 11 21 31 41 

I 1 I I I 

MGEMEQUIQE AEQLKKQIAD ARKACADVTL AEI*VSGLEW GRVQMRTRRT 
NSSIATDSXLL VSASQDGKLI VWDSYTTNKV HAIPLHSSWV MTCAYAPSGS 
CSIYNLKSRE GNVKVSSELS AHTGYLSCCR FLDDNNIVTS SGDTTCALWD 
VGHTGDCMSL AVSPDPNLFI SGACa)ASAKL WOVREGTCRQ TFTQJBSDIN 
ICTGSDDASC RLFDUIADQB LICPSHBSII CGITSVAPSL SGRLLFAGTO 
KSERVGILSG HDNRVSCUSV . TAIxaOlVATG SWDSPLKIWN 

Seq ID NO: 238 DNA sequence

Nucleic Acid Accession #: CAT cluster



51 
I 

LRCaSLAKZYA 
PVAOGGLDNM 
lETGQQKTVF 
AICFFPNGEA 
DFNOJVWDSM 



TCCCAATGTG 
TACXaTTTGC 
ACTGATTCTG 
TGCATTGACC 
TAAGTGACTG 
TTTTGACGAG 
CAAAAAGGGG 
AAGAAAGAAA 
GGATCAAGAA 
CAGAAGT6AA 
TAGAAAAGTT 
CAACTACTCA 
CTTGTGTTCC 
ACTGTTGTTT 
GACATGAAAG 
TAGGCTAAGT 
AAAATCCCAA 



11 

I 

TNGAACCTAC 
TTTTAAGGCA 
TGAATTATGA 
AGTGTGAAGC 
GAAAGCTGAA 
TATCGGGTGA 
AAAAAAAA6A 
AATAAAATAC 
GTTTGTGTAC 
TCAAAATATT 
TTTCTGTAAA 
ACTTTCCTAC 
AATAAAGCTT 
GCCAAGTCCT 
TTCATTGGGT 
TATAATACAC 
ATAAAA 



21 
I 

CATAAATTCT 
GATAATCCTC 
AATCTGAAAA 
ACAGTGGAAT 
GAATCACCGG 
CTTTGAG6T6 
GCAACCAAAO 
ACAATATGGA 
ACATAATCTC 
TCAAAATGCT 
AGTCAGATAG 
TGTAGCACAA 
CATTTACAAA 
AATATAGTTG 
TGCTAAAAAG 
TGTTTTAACA 



31 
I 

TTTCTTAaJG 
CAAGTTTTCr 
GGAATTGGAA 
GAGAATGCGT 
CTTCAGTGAC 
GTCAAfiAAAC 
AAAAAAAATC 
CGATGGAGAA 
ATTTTGAGAT 
GTCTTATGAA 
TAAATATTTT 
GAGTAGCTGT 
AACATGCCAT 
CTTAGCAAGT 
TATGTA6AAA 
ATTOTAAAAT 



41 
I 

GACAAICTTA 
AATGATATCT 
GTTGCTAAAA 
GCCCTGACAC 
ATGGAAGCCA 
CACACTTTAA 
CATAAAATTG 
AAACAGTTAC 
ATATAACTAT 
ACTACAATAT 
AGGTTTTGCA 
GGTACTGTGC 
GOGCCATATT 
ATTGTGAGCT 
TTCAAAGGAA 
GTAAGAGAAA 



51 
I 

TNCTAANCAA 
GAAACTATTA 
ATCTATCATT 
CAAAGAAAAA 
GTGATTTGAT 
GAACAATGTC 
CACAGAA6AA 
ATTTCTTTAT 
TTTTGTCTTT 
TCTCACAGAT 
GTGTCTTTTG 
AAATAAATTG 
TGGCCTGTAC 
ATTTGAGGAA 
AATTAAAATT 
TTTACAAATA 



Seq ID NO: 239 DNA sequence 

Nucleic Acid Accession #: NM_001786.1

Coding sequence: 130-1023 



1 

i 

GG0GGGGGG6 
GTTGTTGTAG 
TGACTAACTA 
GTGTATAAGG 
GAAAGTGAAG 
CTTCGTCATC 
CTCATCTTTG 
CAGTACAT6G 
TTTTGTCACT 
GACAAAGGAA 
AGAGTATATA 
TCAGCTCGTT 
GCAACTAAGA
AGAGCTTTGG 
AAGAATACAT 
GAAAATGGCT 
GGCAAAATGG 
TAGCTTTCTG 
AACTCTTGTC 
GTCTTCTAAT 
ATTCTGTAAA 



11 
I 

GGCACTTGGC 

CTGCCGCTGC 
TGGAAGATTA 
GTAGACACAA 
AGGAAGGGGT 
CAAATATAGT 
AGTTTCTTTC

ATTCTTCACT

CTAGAAGAGT 
CAATTAAACT 
CACATGAGGT 
ACTCAACTCC 
AACCACTTTT 
GCACTCCCAA 
TTCCCAAATG 
TGGATTTGCT 
CACTGAATCA 
ACAAAAAGTT 
TATTTTTGTC 
TTCAAAAATA 
TGTGAAAAAA 



21 
1 

TTCAAAGCTG 
GGCCGCCGCG 
TACCAAAATA 
AACTACAGGT 
TCCTAGTACT 
CAGTCTTCAG 
CATGGATCTG 
TGTTAAGAGT 
TCTTCACAGA 
GGCTGATTTT 
AGTAACACTC 
AGTTGACATT 
CCATG6GGAT 
TAATGAAGTG 
GAAACCAG6A 
CTC6AAAATG 
TCCATATTTT 
TCCATATGTT 
TTATATATAT 
TAACTTAAAA 
AAAAAAAAAA 



31 
I 

GCTCTTGGAA 
GAATAATAAG 
GAGAAAATT6 
CAAGTGGTAG 
GCAATTCGGG 
GATGTGCTTA 
AAGAAATACT 
TATTTATACC 
GACTTAAAAC 
GGCCTTGCCA 
TGGTACAGAT 
TGGAGTATAG 
TCAGAAATTG 
TGGCCAGAAG 
AGCCTAGCAT 
TTAATCTATG 
AATGATTTGG 
ATGTCAACAG 
TTCTTTGTTA 
ATGTAAATAT 
AAAAA 



41 - 
( 

ATTGAGCGGA 
CCGGGATCTA 
GAGAAGGTAC 
CCATGAAAAA 
AAATTTCTCT 
TGCAGGATTC 
TGGATTCTAT 
AAATCCTACA 
CTCAAAATCT 
GAGCTTTTGG 
CTCCAGAAGT 
GCACCATATT 
ATCAACTCTT 
TGGAATCTTT 
CCCATGTCAA 
ATCCAGCCAA 
ACAATCAGAT 
ATAGTTGTGT 
TCAAACTTCA 
TCTATATGAA 



51 
1 

GAGGGACGCG 
CCATACCCAT 
CTATGGAGTT 
AATCA6ACTA 
ATTAAAGGAA 
CAGGTTATAT 
CCCTCCTGGT 
GGGGATTGTG 
CTTGATTGAT 
AATACCTATC 
ATTGCTGGGG 
TGCTGAACTA 
CAGGATTTTC 
ACAGGACTAT 
AAACTTGGAT 
AOGAATTTCT 
TAAGAAGATG 
TTTTATTGTT 
GCTGTACTTC 
TTTAAATATA 



Seq ID NO: 240 Protein sequence: 
Protein Accession #: NP_001777.1

21 



60 
120 
180
240 
300 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140 
1200 



51 



11 21 31 41 

I I I I I I 

KEDYTKIEKI GBGTVGWYK GRHKTTGQW .AMKKIRIiESE EEGVPSTAIR EISZiLKBLRH 
PHIVSLQDVL KQDSRLYLIP EFLSMDLKKY LDSIPPGQYM DSSLVKSYLY QILQGIVFCH 



60 
120 




SSRVLHRDLK PQNLLIDORG TIKLADPGIA RAFGIPIRVY THEWTLWYR SPEVLMSAR 
ySTPVDIKSI GTIFAEXATK KPLFEGDSEI DQXiFRIFBAL GTPHSBVHPE VESIXPYKNT 
FPKHKPGSLA SRVXNLOQiG UlLLSKMLIY OPAKRZS6SM AUIEPYFNZXli OKQXSKH 






180
240 



Seq ID NO: 241 DNA sequence

Nucleic Acid Accession #: NM_033379.1

Coding sequence: 132-854 
CGCCCGCGCG
GCTTTGCAGA 
ATTGACTAAC 
TTGTGTATAA 
TAGAAAGTGA 
AACTTCGTCA 
ATCTCATCTT 
GTCAGTACAT 
TATTGCTGGG 
TTGCTGAACT 
TCAGGATTTT 
TACACGACTA 
AAAACTTGGA 
AAOGAATTTC 
TTAAGAAGAT 
TTTTTATTGT 
AGCTGTACTT 
ATTTAAATAT 



11 

I 

CGGGCTCAAC 
GAGOGCCCTC 
TATGGAAGAT 
GGGTAGACAC 
AGAGGAAGGG 
TCCAAATATA 
TGAGTTTCTT 
GGATTCTTCA 
GTCAGCTCGT 
AGCAACTAAG 
CAGAGCTTTG 
TAAGAATACA 
TGAAAATGGC 
TGGCAAAATG 
GTAGCTTTCT 
TAACrCTTGT 
C6TCTTCTAA 
AATTCTGTAA 



21 

-1 

TTTGTAGAGC 
CAGGGACTAT 
TATACCAAAA 
AAAACTACAG 
GTTCCTAGTA 
GTCAGTCTTC 
TCCATGGATC 
CTTGTTAAGG 
TACTCAACrC 
AAACCACTTT 
GGCACTCCCA 
TTTCCCAAAT 
TTGGATTTGC 
GCACTGAATC 
GACAAAAAGT 
CTATTTTTGT 
TTTCAAAAAT 
ATGT6AAAAA 



31 
I 

GAGGGGCCAA 
GCGTGCGGGG 
TAGAGAAAAT 
GTCAAGTGGT 
CTGCAATTOG 
AiGGATGTGCT 
TGAAGAAATA 
TAGTAACACT 
CAGTTGACAT 
TCCATGGGGA 
ATAATGAA6T 
GGAAACCA6G 
TCTOGAAAAT 
ATCCATATTT 
TTCCATATGT 
CTTATATATA 
ATAACTTAAA 
AAAAAAAAAA 



41 
I 

CTTGGCAGAG 
ACACGGGATC 
TGGAGAAGGT 
AGCCATGAAA 
GGAAATTTCT 
TAT6CAGGAT 
CTTGGATTCT 
CTGGTACAGA 
TTGGAGTATA 
TTCAGAAATT 
GTGGCCAGAA 
AAGCCTAGCA 
GTTAATCTAT 
TAATGATTTG 
TATGTCAACA 
TTTCnTGTT 
AATGTAAATA 



51 
I 

OGG60GGGCA 
TACCCATACC 
AOCTATGGAG 
AAAATCAGAC 
CTATTAAAGG 
TCCAGGTTAT 
ATCCXrrCCTG 
TCTCCAGAAC 
GGC3VCCATAT 
GATCAACTCT 
GTGGAATCTT 
TCCCATGTCA 
GATCCAGCCA 
GACAATCAGA 
GATAGTTGTG 
ATCAAACTTC 
TTCTATATGA 



Seq ID NO: 242 Protein sequence: 
Protein Accession #: NP_203698.3.

1 11 21 31 41 51 

I 1 I I I I 

MEOYTKIEKI GEGTYGWYK GRHKTTGQW AMKKIRLESE EEGVPSTAIR EISLLKELBH 
PNIVSLQDVL MQDSRLYLIF EFLSMDUCKY LDSIPPGOYM DSSLVKWTI, WYRSPEVLLG 
SARYSTPVDI WSIGTIPAEL ATKKPLFHGD SEIDQLPRIF RALGTPNNEV WPEVESLQDY 
KNTFPKHXPQ SLASKVKMLD OIGLDIiLSKM LIYDPAKRZS GKMAIiNHPYP NDLDKQIKKM 

Seq ID NO: 243 DNA sequence

Nucleic Acid Accession #: AF101051.1

Coding sequence: 221-656 



GAGCAACCTC 
CGACCCAGAG 
GCGGGGCCCA 
ACCTGCCACC 
GCTGTTGGGC 
GCCCCAGT66 
CGAGGGGCTG 
TGACTCCTTG 
CATCCTCCTG 
CTTGGAAGAC 
TCTTGCAGGT 
ATTCTATGAC 
TGGCTGGGCT 
COGAAAAACA 
GAAAGACTAC 
GGACATTGAG 
GTATGGTATT 
AAACATGGCT 
TTGTATTACT 
TATATATAGA 
CTCATTATGT 
CCATATTGAT 
CAGTCAAATA 
CTAATTTACC 
TTATTTTTTA 
TTTCATTGGT 
AGCCAAGAAG 
GTGATAAATT 
TTTGCTTTGA 
CACAACTTTA 
ACCmTTGT 
TATATCTTCC 
GATAATCTGG 
TCTTTTTTCT 
AATATTAATT 
TTTATTTGCT 
CTTCATGTGA 
ACACATACCT 
AAACCTACGC 
ATTCTTTCAG 
TTTOCAGTCT 



11 
I 

AGCTTCTAGT 
CTTCTOCAGC 
GCCACCTTCG 
CCTGAGCCAG 
TTCATTCTCG 
AGGATTTACT 
TGGATGTCCT 
CTGAATCTGA 
GGAGTGATAG 
GATGAGGTGC 
CTGGCTATTT 
CCTATGACCC 
GCTGCTTCTC 
ACCTCTTACC 
6TGTGACACA 
ATACTATCAT 
ACAAAACAAA 
TAATCTTATT 
GCTTCCCATT 
TATGTATATA 
TGATACTAGC 
GAAGATGTTT 
TCATTTACTC 
AAGGATGAAT 
CCATAATCTT 
CTCTATCTCC 
AATTTATTAC 
CCTGTTGACC 
AAATATTTGT 
TTGATTGAAT 
TCCCCATTCC 
TAATAAGGTG 
TGACAAATAT 
ATCTGCCAAA 
AGTTTATATT 
CAGCTGGCTG 
TTCACTGCCT 
TCATGTGGTT 
ACATACCTTC 
CTGTGTCTGA 
GTACAGAATG 



21 
I 

ATCCAGACTC 
GGCGGCGCAG 
GGAGTCCGGG 
CGCGGGCX3CC 
CCTTCCTGGG 
CCTATGCCGG 
GCGTGTCGCA 
GCAGCACATT 
CAATCTTTGT 
AGAAGATGAG 
TAGTTGCCAC 
CAGTCAATGC 
TCTGCCTTCT 
CAACACGAAG 
6AGGCAAAAG 
TAACATTAGG 
CAAACAAACA 
TTATCTTCTT 
GAGTAATCAT 
TACATGTTTT 
ATACTTAAAA 
ATTGGTATAT 
TTCTTCATTA 
TCTTTCAATT 
ATAGCACTTG 
TGAATCTAAC 
AAATCAGAAC 
TTCCCACACA 
CCAATTGAGT 
TTTTAAGCTA 
TTAATTGTAT 

TCTCTCTGTA 
TTGAGATAAT 
ACTCTCATTC 
AGACACTGAA 
TCCTCTCTCT 
CAGTGCCTTC 
ATGTGGCTCA 
CATGTTTGTG 
CTATTTCACT 



31 
I 

CAGCGCCGCC 
CGAGCAGGGC 
TTGCCCACCT 
CGAGCX3AGTC 
ATGGATCGGC 
CXjACAACATC 
GAGCACCGGG 
GCAAGCAACC 
GGCCACCGTT 
GATGGCTGTC 
AGCATGGTAT 
CAGGTACGAA 
GGGAGGTGCC 
GCCCTATCCA 
GAGAAAATCA 
ACCTTAGAAT 
AAAAACCCAT 
TCCTCAATAT 
ACTCAAATGG 
TCTATTAAAA 
TATCTCTAAA 
TTTCTTTTTC 
GCTTT6GGT6 
CTTCATGCGT 
CATCGTTATT 
ACATTTCATA 
TTTG6AGGCA 
ATCCCTGTAC 
AGCTGCATGC 
CTTATTCATA 
TGTTTTCCCA 
GTCTGAACAA 
GCTGTAAGCA 
GATACTTAAC 
TTTGAACATG 
GAAGTCACTG 
ACCAGTCTAT 
CTCTCTCTAC 
GTGCCTTCCr 
CTCTGTTCCA 
TGAGCAAGAT 



41 
I 

CCGGGCGCGG 
TCCCCGCCTT 
GCAAACTCTC 
ATGGCCAACG 
GCCATOGTCA 
GTGACCGCCC 
CAGATCCAGT 
OGTGCCTTGA 
G6CATGAAGT 
ATTGGGGGTG 
GGCAATAGAA 
TTTGGTCAGG 
CTACTTTGCT 
AAACCTGCAC 
TGTTGAAACA 
TTTGGGTATT 
GTGTTAAAAT 
AGGAGG6AAG 
GGGAAGGGGT 
ATAGACAGTA 
ATAGGTAAAT 
GTOCTTATAT 
OCTTTGCCAC 
GCCCTTTTCA 
AAGCCCTTAT 
GCCTACATTT 
AATCTTTCTG 
TCT6A0CCAT 
TGTTCCCCCA 
GTTTTATATC 
AGTGTAATTA 
AGTGCTAGAC 
AGTCACTTAA 
CAGTTAGAAG 
AACTATGGCT 
AACAAAACCT 
TTCCACTGAA 
CAGTCTATTT 
CTCTCTACCA 

TrrrAACAAC 

GAT6TATGGA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 



60 
120 
180 



51 
I 

ACCCCAACCC 
AACTTCCTCC 
OGCCTTCTGC 
C6G06CTGCA 
GCACTGCCCT 
AGGCCATGTA 
GCAAAGTCTT 
TGGTGGTTGG 
GTATGAAGTG 
CXSATATTTCT 
TCGTTCAAGA 
CTCTCTTCAC 
GTTCCTGTCC 
CTTCCAGCGG 
AACOGAAAAT 
GTAATCTGAA 
ACTCAGTGCT 
ATTTTACCAT 
GCTCCTTAAA 
AAATACTATT 
GTATTTAATT 
ACATATGTAA 
AAGACCTAGC 
TATACTTATT 
TTGTTTTGTG 
TAGTTTCTAA 
CATGACCAAA 
AGCACTCTTG 
GGTGTTGTAA 
GCGCTAAACT 
TCATGOGTTT 
TTTCTGGAGT 
TCTTTCTACC 
AGGTAGTGTG 
ATGTAGTGTC 
AGACACGTAC 
CAAAACCTAC 
CCACTGAACA 
GTCTATTTCC 
TGCTCTTACT 
AAGGGIGTTG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 




GCACTGGTGT CTGGAGACCT GGATTTGAGT CTTGGTGCTA TCAA TCRCCG 'fClVlt JmtS 2S20 

AGCAAOGCAT TTGGCTGCTG TAAGCTTArP GCTTCATCTG TAAGOGGTGG TTTCTAATTC 2S80 

CTCATCTTCC CACCTCACAG TGATGTTGTG GGGATCCAGT GAGATAGAAT ACATGTAAGT 2640 

CTGGTTTTGT AATTTGAAAA GTGCTATACT AAG6GAAA6A ATTGAGGAAT TAACTGCATA 2700 

CUr n 'i'bGlG TTGCTTTTCA AATGTTTGAA AATAAAAAAA TGTTAAGAAA TOQC?rTTCrT 2760 

GCCTTAACCA GTCTCTCAAG TGATGAGAOV GTGAAGTAAA ATTGAG TQCA CTAAAO GAAT 2820 

AAGATTCTGA GGAAGTCTTA TCTTCTGCAG TGAGTATGGC CXSVATGCTTT CTGTG6CTAA 2880 

ACAGIVTGTAA TGGGAAGAAA TAAAAGCCTA CGTGTTGGTA AATCCAACAG CAAGGGAGAT 2940 

TTTTCAATCA TAATAACTCA TAAGGTrGCTA TCTGTTCAGT GATGCCCTCA GAGCTCrTGC 3000 

TOTTAGCTGG CAGCTGACGC TGCTAGGATA GTTACTTTGG AAATQGTACT TCATAATAAA 3060 

CTACACAAGG AAAGTCAGCC ACCGTGTCTT ATGAGGAATT GGAOCTAATA AATTTTAGTG 3120 

TGCCTTCCAA ACCTGAGAAT ATATGCTTTT G6AAGTTAAA ATTTAAATG6 CTTTTGCCAC 3180 

ATACATAGAT CTTCATGATG TGTGAGTGTA ATTCCATGTG GATATCAGTT ACCAAACATT 3240 

ACAAAAAAAT TTTATGGCCC AAAATGACCA ACGAAATTGT TACAATAGAA TTTAT CCAAT 3300 

TTTGATCTTT TTATATTCTT CTACXACACC TGGAAACAgA CCR ATitfSACa TTTTGGGGTT 3360 

TTATAATGGG AATTTGTATA AAGCATTACT CTTTTTCAAT AAATTGTTTT TTAATTTAAA 3420 
AAAAGGAAAA AAAAAAAAAA AAA 



Seq ID NO: 244 Protein sequence:
Protein Accession #: AAD16433.1



1 11 21 

i I I 

MANAGLQLLG FILAPLGWIG AIVSTALPQW 
QIQCKVFDSL LNLSSTLQAT RALMWGILL 
IGGAIFLLAG LAILVATAWY GNRIVQEFYD 
LLCCSCPRKT TSYPTPRPYP KPAPSSGXDY 



31 41 51

i I 1 

RlYS^Maatl VTAQAHySGL WMSCVSQSTG 60 

GVIAIFVATV GMKCMKOiED DEVQXMRMAV 120 

PMTPVHARYB FGOAIiFTGWA AASLCLLGGA .180 

V 



Seq ID NO: 245 DNA sequence 

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

I I i 1 I 1 

rrr m 'r m rnTrm ' TT TTTTTCAAGG AGAGCACAAG GAACTTTATT AATGACTTTC 60

TTAATGGTTA AATGCTCTTT ACCAAGTGAC CCAGAGGCAG CGTGGTTTAG TGGTTTCAAC 120 

AGCATGGTCC CGAGAGTCTG ACAAACCTCA GTTCAAAIGC TTCTTTTGTC TT CACTT AGT 180 

TTTTCTTCCT GAGATTTAGT TTCTTCATCG TTAACAATGA GGATATTAAT ATGTTTCACA 240 

CAGTTGTTAT GAAGAATGCA TATATTAGAA TGCCTGTAGT CTCAGCTACT CAGGAGGCTA 300

AGGTGGGGAG GTCGCTCAAG CCCAGGAATT CAAAGCTGCA ATGCATTATG ATTACAGCTG 360 

TTAATAGCCA CTGCACTTCA GCCTGGGCAA TGTAGTAAGA TCCCATCTCT GGCTCGGAGG 420 
GTCCTACGOC CAOGGAGTCT CGCTGATT6C TAGCACAGCA GTCTGAGATC AAACTGCA 



Seq ID NO: 246 DNA sequence 

Nucleic Acid Accession #: XM_058553.2

Coding sequence: 897-1400 

1 11 21 31 41 51 
1 I I I t I 

AATTTTCAGA AGTTTCGTAT GGGGATGGTT TTATATAAAT TCAGGTTTTT CCCACAATAA 60 

TAAATGTATT TAGTCTCAGT GCTCAATAGA AGAGATTTCT AATAGAAAAG GATTCAAACT 120 

GTGAAACCAT TTCTCTTTTA ATGTTrCACA TTCCTGTTAC AGATTT GTTC TCTTGTGACT 180 

CTGTTATCCA TAATATGGAC AGTTCTTGAG TCCTAACATT GAGAGGTTTT CCCTTAGTGC 240

ATAGAX3GGAA TGAGTATTAA TTGGAGAAGC TTAAAGTATT GOCACTTTAG CACTGAAGAT 300 

TGGGATGAGA GGAGGTGAAA CCTCACTAGA AAAA06GACA ATGTTAGTGT GGCCCTTCCT 360 

GATCATGTTT AAGAAAAGTC ATGAAAATGG T6AACTAGT6 TTTCCAAGCA TATTGGAAGG 420 

GTTGAGTGTA TACTGTCTGT CAAAGACTTC CAGCATTTCC AGGTCCTAGA GAGGAACAAG 480 

ACTGGTAACC TGCCTATCTG TATTTTTAAG AACCCAGGAG GAAAGCTTTA TAATAGAACA 540 

TTATTTCTGT GTTTATGTAT AAGGGGTTTT TTGTTTTTTT AAAGACAGGA TCTCACTCCA 600 

TTGTCCAGGC CAAGTGCAAT GGCACGAACC TCATAGCTCC TGGACTTAAG TGATCTGCCT 660 

GCCTTTGCCT CCTGAGTAGC TGGGACTACA GGCATGAGCC CCCATGCCTG GCTAAGTTTG 720

■ rrrrm ' G T T TGTTTGTTTG TTTGTTTTTG GGGGGGGTTG TTTTGTTTTT TGTAGAGACG 780

TAGTCTTGCT TTGTTGCCAG GCTAGTCTCA AACTOCIGGC TTCAAGTGAT CCTCCTGCCT 840 

CAGCCTCCCA GAGTGCTAGG ATTACAGCAC TTGGATTCAG CTTCTTCATT TCCAACATGG 900 

AAGAAACTTA CACCGACTCC CTGGACCCTG AGAAGCTATT GCAATGCCCC TATGACAAAA 960 

ACCATCAAAT CAGGGCTTGC AGGTTTCCTT ATCATCTTAT CAAGTGCAGA AAGAATCATC 1020 

CTGATGTTGC AAGCAAATTG GCTACTTGTC CCTTCAATGC TCGCCACCAG GTTCCTCGAG 1080 

CTGAAATTAG TCATCATATC TCAAGCTGTG ATGACA6AAG TTGTATTGAG CAAGATGTTG 1140 

TCAACCAAAC CAGGAGCCTT ACACRAGAGA CTCTGGCIGA GAGCACTTGG CAGTGCCCTC 1200 

CTTGCGATGA AGACTGGGAT AAAGATTTGT GGGAGCAGAC CAGCACCCCA rrTGTCTGGG 1260 

GCACAACTCA CTACTCTGAC AACAACAGCC CTGCGAGCAA CATAGTTACA GAACATAAGA 1320 

ATAACCTGGC TTCAGGCATG OGAGTTCCCA AATCTCTGCC GTATGTTCTG CCATGGAAAA 1380 

ACAATGGAAA TGCACAGTAA CTGAATACCT ATCTCATCAA ATGCCAGACC CTAGAAGACT 1440 

GTTGCTTCTT CTTCTACCAG TGGGTTCTCA TTTTCCTCCT AATCTAATTA TAGAATGGTA 1500

AACTCCCTGT GACTTTCCAA ACTGACAAGC ACACTTTTTT CCTCCCCCCT TGAATCCTCA 1560
TTTAATGCAA GAAOCCTCAT ACTCAGAAGC TTCCAAATAA ACCTTTGATA CAGATTG 



Seq ID NO: 247 Protein sequence: 
Protein Accession #: XP_058553.1

1 11 21 31 41 51 

i I ! 1 1 ! 

MEBTYTDSLD PEKLLQCPYD KNHQIRACRF PYHLIKCRKN HPDVASKLAT CPFNARHQVP 60 
RABISHHISS CDDRSCIEQD WNQTRSIiRQ ETIiAESTWQC PPCDEDWDKD LWEQTSTPFV 120 
WGTTHYSDNN SPASNIVTEH XMNLASGMRV PKSLPWLPH XNNGNAQ 




Seq ID NO: 248 DNA sequence
Nucleic Acid Accession #: NM_003392
Coding sequence: 758..1855






1 
I 

TTAAGGAAAT 
AACTGATTAT 
CGCGCTGGTC 
CATCCTCCAC 
TCCACTCGCC 
GGCCCAGGTT 
GAGGGACTGA 
GGAAGGAGGC 
TCX5CTGGCGT 
TGGGTGGATT 
GAGAAGG6CA 
TGTATGCTTG 
GTCCATTGGA 
CAAGTTCTTC 
CAATTCTTGG 
AGGAGCACAG 
CCACTTCTAT 
ATGCCAGTAT 
TTTTGGCA06 
AGCAGGGGTG 
CTGCAGCCGC 
OGACAACATC 
GCGCATCCAC 
OGAGGCGGGC 
GTCCGGCTCA 
TGATGCCCTG 
GTTGGTACAG 
CCCCAGCCCT 
CCTGTGCAAC 
GTAOGACCAG 
CTACGTCAAG 
GCCACCCAGC 
TGGTTTTTGG 
TTTTTTCCTG 
GGCAATAATG 
TACAAGACTT 
CTGTGTGGGA 
TGCCATCATA 
TTCAGCTTCT 
AAAACAAAAC 
CATTTTCAAA 
GGTATATCAC 
AATAGCTCAT 
CTCTTATGTC 
AA6ACCCX:CA 
ATGAAATATC 
TACATGAATC 
GCACTGCACC 
ACACTGA8CC 
GCAGCTCCAC 
TGGAAAACA6 
TACSTTTTCTA 
TATCACTGTT 
GTGTACCTTA 
GGTTTAATGG 
ATATATAAAT 
CTCTGGGGTT 
ATTCCAAAA6 
GCAOGACGAA 
TGTTGGGTTG 
TTCTGTTCAC 
ATTCAAAACT 
AACCCATGCC 
CACTACATAG 
CAGGCCATCA 
ACATCTTTTC 
CTCTTAATTT 
ATAATGATAT 
AAAGCACTAA 
.TACTTTTTTT 
GTTGAGTTTA 
CATTCAGATA
GTCTTGGGTG 
AATG6AAGAT 



TTCACTACTT 
TGTTTTAATG 
ATGATCCTGT 
AAACTGTTCC 
TGCCTGATAT 
ATAAATATAA 
ATCTCTCTGT 
TTTTTTC3AGT 
GCAACCTOGT 
AAGATATCTT 
TTTGTGGAGA 
CAGAAGCATC 
TATTAGAAAT 
ATAGCTTTTT 
AATATGTTCT 
ATACCXXCCC 
ATTGCATAAT 
TCACATCCCC 
TTAGTTTAAA 
ATTTGCTAAA 
ACAATCCTAG 
TTATGTATAT 
ATTTGTATAT 
AGAATATAAA 



21 
I 

CTTOCCCATC 
ATGTTAATTC 
CGCCCCCCAC 
GGCCACCCX33 
CTCTOGCCCA 
0GGAGGGTG6 
AACTAGAGAG 
ACCAGGGCTT 
CAGGATCCCA 
AAGAAACTGC 
AGTAAACTTA 
GAGAGGGAAT 
CAGGAGTTGC 
TGGOCATATT 
CTATGAATAA 
GCCAACTGGC 
TGCAGTACAT 
ATCGACXWTG 
TAGGCAGCGG 
TGAGC0GG6C 
: CCAAGGACCT 
ACCGCTTTGC 
CCTACGAGAG 
TGTACAACCT 
AGACATGCTG 
ACGACAGCGC 
GCTTCAACTC 
TGCGCAATGA 
AGGGCATGGA 
TGCAGACGGA 
GCACGGAGAT 
GCTCCCAGGA 
TATTTTTTAT 
AGAACTCTGT 
CCACGAAAAA 
GTATAGAATG 
ATCCAGAAGG 
AGGTTCCAGT 
TGAGTTGTAA 
AAACCTCCCT 
ACAATGGAAG 
CTCCTCAAAT 
CAGCAGCGAG 
TTTGAAGCTG 
ACACTAGATT 
TTAGGGATAC 

GTTTCTCAGC

AOCTATTTGA 
CCTCCGTGTT 
TTGGTTGTAG 
AGGGATTTTT 
ACAGAACTTG 
GTTTAGATTA 
CAGTGTACTT 
CTCAAAGTCT 
ATATATCTCA 
CTAGAGCATT 
CTTGAGCTTG 
TTCTGAGGAA 
TTTTTCTTTT 
GGGCATTACT 
AGCAATGTTT 
GACAGTACTT 

■ m ' ri ' T ' m i 

TATCTCAGAC 
TTAGGAGGTT 
GATATCCACA 
TCAGTTGCAG 
ATGTCACTTT 
TCAGATTGTT 
CTTTTAAAAG 
CTTCTAGCCT 
TTCACTGGTT 
ATAAAAOGTT 



31 

i 

TGGAAGTG6C 
GGAGCTGCAT 
CCCCTGCCCT 
CCrCCTTGGC 
TGGAATTAAT 
OCGCAGCQGG 
GGGTCAGGGG 
TGACTCAACA 
GOGAAAATCA 
CTATATCTTG 
AGAGACCrCC 
AAACATCTTT 
TTT6GGGATG 
TTTCTCCTTC 
CCCTGTTCAG 
AGGACTTTCT 
CGGAGAAGGC 
GAACTGCAGC 
□GAGACGGCC 
GTGCOGOGAG 
GCGG06G6AC 
CAAGGAGTTC 
TGCTOGCATC 
GGCTGATGTG 
GCTGCAGCTG 
GGOGGCCATG 
GCCCACCACA 
GAGCACCGGC 
TGGCTGCGAG 
GCGCTGCCAC 
CX^TGGACCAG 
CC06CTTATT 
TTTTCCCCAA 
GGTTTATTAT 
TATTTATTTT 
AAGGGGGAAA 
TAAAGAAATA 
TGAAAGAGGG 
ATTCTCTGGT 
TCCCCA6CAG 
GACAAGAATG 
ATTCCATTTG 
GAAAGTCCCC 
TTATAAGAAT 
TTTTGTTTGG 



TTG(?rTA6TA 
CXAAGCAACA 
GGAAAAACAG 
GTGATGTGAT 
GACAGGAAAT 
GTTTCCTAAA 
GCTAATGGAA 
TCCACTCATG 
GAACAGTTGC 
TTTGTACATA 
TTGCAGCCAG 
GTTGTCCTTC 
GGCTGTGGCC 
GAAGCTTGAG 
CTGCCTCACC 
TGTTCGTTAT 
CTCTTTTCTT 
ATTAATTGAG 
TnTTTTTAA 
TTA0C5TTGTT 
G6GCTTTGAT 
TCAGGCAACT 
TGAATTGTGA 
TTTGGTTTTT 
CCTTTTTAGT 
AAACTATTTA 
TTATTCTGTA 
TAAAAAACAA 
ACTTGTAAAA 



41 
I 

TTTOCCCACA 
TTCCCA6CEG 
TCCCTCCOGC 
AGCCTCTGGC 
TCTGGCTCCA 
TTCCTGAGTG 
GTGCGGGACT 
CAATTGAGAC 
GATTTCCTGG 
CCATCAAAAA 
GATGCTCCCC 
TCCTTCTTCC 
6CTGGAAGTG 
GGCCAGGTTG 
ATGTCAGAAG 
CAAGGACAGA 
GCXSAAGACAG 
ACTGTGGATA 
TTCACATACG 
GGCGAGCTGT 
TGGCTCTGGG 
GTGGAOGCCC 

CTCATGAACC
GCCTGCAAGT
GCACACTTCC
CGGCTCAACA
CAAGACCTGG
TCGCTGGGCA
CTCATGTGCT
TGCAAGTTCC
TTTGTGTGCA
TATAGAAAGT
GAATTGCAAC
TAATATTATA
GTGGATCTTT
TAACACATAC
CATTTTCTTT
TGGTAGAAAT
GCAAGATAAA
GGCTGCTAGC
TCATATTCTC
CAGACAGACC

AGAAATTAAA 
TGGGATTCCA 
GGAGGTTGGC 
AATTATAATA 
AGGTAATTGC 
TGAAATCCAC 
GCTG6CCAC6 
GAAACATTAG 
ACTTTTATTT 
TTCACAGAGG 
CTTCrOCTAT 
ATTTATAAGG 
ACATATATAT 
TGATTTAGAT 
ACTGCAGTCC 
CCGCTGTGAT 
TTCTGACTCA 
CCTTTGTCTC 
AGACATGGAC 
AGTTCATTCT 
TCCCTAAGGA 
TAAGGACACC 
TTAAAAGTTT 
ATCACCTCAG
GTGGCTCTTT 
GCAAAAGATC 
ATTATACAAA 
GACTCATGTT 
ATGTAAAATA 
CTTTTAATGT 
ACATCGAAAG 
AAAAAAAA 



51 
I 

TOGGCTOGTA 
GGCACTCTOG 
GTCCTGCCCC 
GGCAGCGCGC 
CTTGTTGCTC 
AATTACCCAG 
CGAGOGAGCA 
AOGTTTCTAA 
TGAGGTTGCG 
ACTCAOGGAG 
TOGTTTAACT 
CTCTCCAGAA 
CAATGTCTTC 
TAATTGAAGC 
TATATATTAT 
AGAAACTGTG 
GCATCAAAGA 
ACACCTCTGT 
CCGTGAGCCC 
CCACCTGCGG 
GCGGCT60GG 
GOGAGGGGGA 
TGCACAACAA 
GCCATGGGGT 
GCAAGGTGGG 
GCCGGGGCAA 
TCTACATCGA 
OGCAGGGCCG 
GCGGCCGTGG 
ACTGGTGCTG 
AGTAGTGGGT 
ACAGTGATTC 
C3GGAACCATT 
ATTATTATTT 
GAAAAGGTAA 
CCTAACTTAG 
TTCTCAAATA 
CTATTCACAA 
AGGTCTTGGG 
TTGCTTTCTG 
AAGGAAAAAA 
GTCATATTCT 
AAATTTAAAA 
GATTTGTAAA 
TTGAACATAA 
GTAGAAATAA 
GTGCCATTCA 
CTTCCTCTTC 
TTTCCAAAOG 
GAGCTCTGCT 
TGAGGAGCAG 
TGTTGCAGCG 
TGTACTGCAG 
GGGGAAATGT 
ATATATACAT 
TTACAGCTTA 
AGTTGGGATT 
CATACCCTGA 
CTGAAATGCG 
CAACCTCCAT 
GTTAAGAGAT 
GCAGAATG6A 
ATATTCAGCC 
TCTTTCCAAA 
GGAAAGATAC 
CCAACTGTGG 
AATTTATTGC 
TT6AAAGCAA 
AACCATGAAG 
TATGAAGAGA 
TTCTACATGT 
ACATATTTCT 
GCTTATTCCA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 



Seq ID NO: 249 Protein sequence: 
Protein Accession #: NP_003383



11 
I 



21 
I 



41 
I 



51 

! 

MAGSAIISSRP FLVALAIFFS FAQWIEAKS HHSLGMSNPV 
SQGQKKLCHL YQDHMOYIGE GAKT6IKE0Q YQFHHRRWNC 
AFTYAVSAAG WNAMSRACR EGELSTOGCS RAARPRDLPR 
FVDARERERI HAKGSYESAR lUetLHNNEA GHRTVYJILAD 
LADFRKVC3JA LKHKVDSAAA MRLNSSGKLV Q VKSRPNSP T 
GSLGTQGHLC NKTSBtaiDGC BmOOGRGYD QFICTVQTERC 
QFVCK 

Seq ID NO: 250 DNA sequence
Nucleic Acid Accession #: NM_014058
Coding sequence: 56..1324






QMSEVYIIGA 
STVDKTSVFG 
OWLKGGQGDST 
VACKCHGVSG 
TQDLVyiDPS 
HCKFEHCCIfV 



I 
I 

TGACTTGGAT 
TCGGCCAGAT 
OGTCATCTTC 
GAGATATAAT 
ACTATATGCT 
TGAATCAATG 
TCAGGTTATC 
TAGATTTCAC 
TGAAAAGCTG 
AAAAATCAAC 
TAAAACTCTA 
GC0CT6GCA6 
TGCCACATGG 
GACTGCTTCC 
AATTGTCCAT 
TTCTAGOCCT 
TGAGTTTCRA 
TTACAGTCAA 
TGAACCTCAA 
AGGAAAAACA 
AGATATCTGG 
GCCTGGTGTT 
CTAAGAGAGA 
CCATTTTTAG 
AATAAACTGT 



11 
I 

GTAGACCTCG 
GTGGTGAGGG 
ATATCCCTGA 
CAAAAGAAGA 
GAGTTTGGCA 
GTGAAAAATG 
AAGTTCAGTC 
TCTACTGAGG 
CAAGATGCTG 
AAGACAQAAA 
G6TCAGAGTC 
GCTAGCCTGC 
CTTGTGAGTG 
TTTGGAGTAA 
GAAAAATACA 
GTTCCCTACA 
CCAGGTGATG 
AATCATCTTC 
GCTTACAATG 
GATGCATGCC 
TACCTTGCTG 
TATACTAGAG 
AAAGCCTCAT 
AGATACAGAA 
TTGCTTGATG 



21 
I 

ACCTTCACAG 
CTAGGAAAAG 
TTGTCCTGGC 
CCTACAATTA 
GAGAGGCTTC 
CATTTTATAA 
AACAGAAGCA 
ATCCTGAAAC 
TAGGACCCCC 
CAGACAGCTA 
TCAGGATCGT 
AGTGGGATGG 
CTGCTCACTG 
CAATAAAACC 
AACACCCATC 
CAAATGCAGT 
TGATGTTTGT 
GACAAGCACA 
ACX3CCATAAC 
AGGGTGACTC 
GAATAGTGAG 
TTACGCCCTT 
GGAACAGATA 
TTGGAGAAGA 
GAAAAAAAAA 



31 
I 

GACTCTTCAT 
AGTTTGTTGG 
AGTGTGCATT 
CTATAGCACA 
TAACAATTTT 
ATCTCCATTA 
TGGAGTGTTG 
TGTAGATAAA 
TAAAGTAGAT 
TCTAAACOVT 
TGGTGGGACA 
GAGTCATGGC 
TTTTACAACA 
TTOGAAAATG 
ACATGACTAT 
ACATAGAGTT 
GACAGGATTT 
GGTGACTCTC 
TCCTAGAATG 
T6GAGGACCA 
CIGGGGASAT 
GCGGGACTGG 
ACATTTTTTT 
CTTGCAAAAC 
A 



41 
I 

TGCTGGTTGG 
GAACCCTGGG 
GGACTCACTG 
TTGTCATTTA 
ACAGAAATGA 
AGGGAAGAAT 
GCTCATA1GC 
ATTGTTCAAC 
CCTCACTCAG 
TGCTGCGGAA 
GAAGTAGAAG 
TCTGGAGCAA 
TATAAGAACC 
AAACGGGGTC 
GATATTTCTC 
TGTCTCCCTG 
GGAGCACTGA 
ATAGAOGCTA 
TTATGTGCTG 
CTGGTTAGTT 
GAAT6TGQGA 
ATTACTTCAA 
TrGTTTTTTG 
AGCTAGATTT 



QPLCSQLAGIi 
HVKQI6SRET 
IDYGYRFAKE 
SCSUCTCNIO 
FDYCVRHEST 
KCKKCTEIVD 



51



CAATGATGTA 
TTATOGGGCT 
TTCATTATGT 
CAACTGACAA 
GCCAGAGACT 
TTGTCAAGTC 
TCTTGATTTG 
TTGTTTTACA 
TTAAAATTAA 
CACGAAGAAG 
AGGCTGAATG 
OCTTAATTAA 
CTGCCAGATG 
TtXGGAGAAT 
TTGCAGAGCT 
ATGCATCCTA 
AAAATGATGG 
CAACTTGCAA 
GCTCCTTAGA 
CAGATGCTAG 

aacxx:aacaa 
aaactggtat 
ggtgtggagg 

GACTGATCTC 



Seq ID NO: 251 Protein sequence:
Protein Accession #: NP_054777



1 
I 

MYRPDWRAR 
DKLYAEFGRE 
ICRFHSTEDP 
RSKTLGQSLR 
RWTASFGVTI 
SYEFQPGDVM 
LBGKTDAOQG 
GI 



11 
I 

KRVCWEPWVI 
ASNNFTEMSQ 
ETVDKIVQLV 
rVGGTEVEEG 
KPSKMKRGIiR 
FVTGFGALKN 
OSGGFLVSSD 



21 
I 

GLVXFISLIV 
RLESMVKNAP 
LHEKLQDAVG 
EWPWQASLQW 
RIIVHEKYKH 
DGYSQNKLRQ 
ARDIWYLAGI 



31 
I 

LAVCIGIiTVH 
YKSPUIEEFV 
PPKVDPHSVK 
DGSHROGATL 
PSHDYDISIiA 
AQVTLIDATT 
VSWCaSECAKP 



41 
I 

YVRYIlQKlCrY 
XSQVIKFSQQ 
IXKINKTCTD 
INATWLVSAA 
ELSSPVPYTN 
OJEPQAYNDA 
NKPGVYTHVT 



51
1 

NYYSTLSFTT 
KHGVLAHMIiL 
SYLNHCCXSTR 
HCFTTYKNPA 
AVHRVdiPDA 
ITPRMLCAGS 
ALRDHITSRT 



Seq ID NO: 252 DNA sequence 

Nucleic Acid Accession #: NM_003504.2

Coding sequence: 71-1771



GGCACGAGGC 
CGCCGTGGCT 
GAGGGTCCTT 
GGCCTTGTTC 
ACTTGAAACT 
TGGAGCTAAT 
GTGTGACACC 
ACTCATTAAA 
AGAGGAG6AT 
CACAOGGTTA 
GGAGGCCCGG 
GTCAGCCATG 
GTGGTGGGCC 
ATACGTGACT 
GGATGAGGAG 
CCTGGTGCTC 
AGCCAGGTTC 
CATGGGTCTT 
GGAGAATTTG 
CGTGCAGACT 
CTTTGCCACC 
CATCCAGGCT 
ACTCGCCAAG 
CCTCGTCATC 
CATGCTGTTC 



11 
I 

CTCGTGCCGC 
ATGTTCGTGT 
CTCTTCGTGG 
CAGTGTGACC 
GCATTTCTTG 
GTAGACCTAT 
CATAGGCCAG 
CAAGATGATG 
GAAGAGCATT 
GAAGAGGAGA 
AGAAGASACA 
GTGATGTTTG 
ATCGTTGGAC 
GATGTTGGTG 
AACACACTCT 
TACCAGCACT 
AAGCTGTGGT 
CCCCTGAAGC 
CGGGAAATGA 
TTCACCATTC 
ATGTCTTTGA 
CTGGACAGCC 
AAGCAGCTGC 
TCCCAGGGGC 
TCTAGGCOGG 
ACAAAGAACC 



21 

1 

CGGGCTCTTG 
CCGATTTCCG 
CCTCGGACGT 
ACGTGCAATA 
AGCATAAAGA 
TGGATATTCT 
TCAATGTCGT 
ACCTTGAAGT 
CAGGAAATGA 
TAGTGGAGCA 
TCCTCTTTGA 
AGCTGGCTTG 
TAACAGACCA 
TCXTTGCAGOG 
COGTGGACTG 
GGTCCCTCCA 
CTGTGCATGG 
AGGTGAAGCA 
TTGAAGAGTC 
ATTTTGGGTT 
TGGAGAGCCC 
TCTCCAGGAG 
GAGCCACCCA 
CTTTCCTGTA 
CATCCCTAAG 
GG06CTGCAA 



31 
I 

GTACCTCAGC 
CAAAGAGTTC 
GGATGCTCTG 
TACGCTGGTT 
ACAGTTTCAT 
TCAACCTGAT 
CAATGTATAC 
TCCCGCCTAT 
CAGTGATGGG 
AACCATGCGG 
CTACGAGCAG 
GATGCTGTCC 
GTGGGTGCAA 
CCAOGTTTCC 
CACACGGATC 
TGACAGCCTG 
ACAGAAGGG6 
GAAGTTCCAO 
TGCAAATAAA 
CAAGCACAAG 
CGAGAAGGAT 
TAACCTGGAC 
GCAGACCATT 
CTGCTCTCTC 
GCT6CTCAGC 
ACTGCTGCOC 



41 
I 

GCGAGCGCCA 
TACX3AGGTGG 
TGTGG6TGCA 
CCAGTTTCTG 
TATTTTATTC 
GAAGACACTA 
AACGATACCC 
GAAGACATCT 
TCAGAGCCTT 
AGGAGGCAGC 
TATGAATATC 
AAGGACXrrGA 
GACAAGATCA 
CGCCACAACC 
TCCTTTGAGT 
TGCAACACCA 
CTCCAGGAGT 
GCCATGGACA 
TTTGGGATGA 
TTTCTGGCCA 
GGCTCAGGGA 
AAGCTGTACC 
GCCAGCTGCC 
ATGGAGGGCA 
AAACAGCTGC 
CTGGTGAIGG 



60 
120 
180 
240 
300 
360 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 

1080

1140 
1200 
1260 

1320 
1380
1440 



60 
120 
180
240 
300 
360 
420 



51 
1 

GGCGTCCGGC 
TCCAGAGCCA 
AGATCCTTCA 
GGTGGCAAGA 
TCATAAACTG 
TATTCTTTGT 
AGATCAAATT 
TCAGGGATGA 
CTGAGAAGOG 
GGCGAGAGTG 
ATGGGACATC 
AT6ACATGCT 
CTCAAAT6AA 
ACCGGAACGA 
ATGACCTCOG 
GCTATACCGC 
TCCTTGCAGA 
TCtCCTTGAA 
AGGACAT6GG 
GOGACGTGGT 
CAGATCACTT 
ATGGCCTGGA 
TTTGCACCAA 
CTCCAGATGT 
TCAAGTCCTT 
CTGOOGCGCT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 




CaCCATtSGAG CaTGGCACftC TGAOOGTGGT GGGCATCCCC CCAGACACOG ACAGCTOSGA 1620 

CACGAAGAAC TTTTTTGGGA GGGCffmCA GA^ 1680

GCTGCACAAC CATTTTGACC TCTCACTAAT TCAGCTGAAA GCTGAG6ATC GGA GCAAGTT 1740 

TCTGGACGCA CTTATTTCCC TCCTGTCCTA GGAATTTGAT TCTTCCaGAA TGACCTT^ 1800 

ATTTATCTAA CTGGCmCA TTTAGAITGT AAGTTATGGA CATGATTTGA GATGTAGAAG 1860 

OCATTTrrrA TTAAATAAAA TGCTTATTTT AGGCTC^ 1920 
AAAAAAAAAA AA 

Seq ID NO: 253 Protein sequence:
Protein Accession #: NP_003495.1

1 11 21 31 41 51

ipySOFRKEP YEWQSQRVL LvASDVDAI. CACKIWJALP QCDHVQYTLV PVSGWQELBT 60 

aflSSqfh YFILINCGAN V0UJ)ILQPD edtiffvcdt HRPVNWNVY NDTQIKLLIK 120 

ODDDLEVPAY EDIFRDEEED EEHSGSDSDG SBPSEKRTHL BEBIVBQTKR BHQRHEHEAR 180 

RRDILFDYEQ YEYHGTSSAM VMFELAWMLS KDLKDMLHWA IVGLTDQWVQ DKIWKYVT 240 

DVGVLQRHVS RHNHRNEDEE NTLSVDCTRI SPBYDLRLVL YQHHSIiJDSL CNTSYTAARF 300 

KLWSVHGQKR LQEFLADMGL PLKQVKQKFQ AMDISLKBNL REMIEESANK FGJ^DMRVQT 360 

PSIHFGFKHK PLASDWFAT MSLMBSPEKD GSGTDHFIQA U3SLSRSNLD KLYHGLELAK 420 

KQLRATQQTI ASCLCTNLVI SQGPFLYCSL KEGTPDVW.P SRPASLSLl^ ^^^JJ??!!^™ HI 

TKKRRCKLIjP LVMAAPbSKE HGTVTWGIP PETDSSDRKK FPGRRFEKAA ESTSSRMLHN 540 
HFDLSVIELK AEDRSKPLDA LISLLS 

Seq ID NO: 254 DNA sequence
Nucleic Acid Accession #: NM_022337
Coding sequence: 48..683



1 11 21 31 41 51 

GGCTGCGCTT CCCTGGTCAG GCACGGCACG TCTGGCOGGC CGCCAGGATG CAGGCCCOGC 60 

ACAAGGAGCA CCTCTACAAG TT6CTGGTGA TTGGOGACCT GGGOGTGGGG AAGACCAGTA 120 

TCATCAAOCG CTACCyTGCAC CA6AACTTCT CCTCGCACTA CQQQGOCACA ATCGGCGTGG 180

ACTTCXKX3CT CAAGGTGCTC CACTGGGACC CGGAGACTGT GGTGOGCCTG CAGCTCTGGG 240 

ATATOGCAGG TCAAGAAAGA TTTGGAAACA TGAOGAGGGT CTATTACCGA GAAGCTATGG 300 

GTGCATTTAT TGTCTTCGAT GTCACCAGGC CAGCCACATT TGAAGCAGTG GCA AAgTGGA 360 

AAAATGATTT GGACTCCAAG TTAAGTCTCC CTAATGGCAA ACCGGTTTCA GTGGTTTTGT 420 

TGGOCAACAA ATGTCSACCAG GGC5AAGGATG TGCTCATGAA CAATGGCCTC AAGATGGACC 480 

AGTTCTGCAA GGA6CA0GGT TTCGTAGGAT GGTTTGAAAC ATCAGCAAAG GAAAATATAA 540 

ACATTGATGA AGCCTCCAGA TGCCTGGTGA AACACATACT TGCAAATGAG TGTGAOCTAA 600 

TCGAGTCTAT TGAGCCGGAC GTCGTGAAGC COCATCTCAC ATCAACCAAG GTTGCCAGCT 660 

GCTCTGGCTG TGCC3\AATCC TAGTAGGCAC CnTGCTGGT GTCTGGTAGG AATGACCTCA 720 

TTGrrcCACA AATTGTGCCT CTATTTTTAC CATTTTGGGT AAACGTCAGG ATAGATATAC 780 

CACATGTCGC AAGCCAAAGA TCTATGCCTC TGTTTTTTCA ATGAGAGAGA AATAGCAAAT 840 

GTTCTTTCTA TGCTTTCCTC ACCATCATCA CAGTGTTTAC AAACTTTTGA AAATATTTAG 900

TCTGTTACAA ACTTCTGTCA TCTAGCTGAC CAAAATCCTG CAGGGCCACA GTCGGCACTG 960 

TTATTTQCTT CTTTTAATCA GCAAACGCCT CAAGTCTTAA AATAAAAGGG GAGAAGAACA 1020 

AACTAGCTGT CAAGTCAAGG ACTGGCTTTC ACCTTGCCCT GGTGTCTTTT TCCAGA TTTC 1080 

AATATATTCT CTGATGGCCT GACAGGCCTA TTAAGTAGAT GTGATATTTT CTTOCRAGAT 1140 

GACCTCCATT CTOGGCAGAC CTAAGAGTTG CCTCTGAGTT AGCTCTTTGG AATOGTGAAC 1200 

ACAGGTGTGC TATATTGTCC TTGTCCTAAC TGTCACrTGC CATGGCCTGA ATGTTGGCTT 1260 

AACTGAATAT TGTATGAAAA GACATGCCTC CATATGT6CC TTTCTGTTAG CTCT CTTTG A 1320 

CTCAAGCTGT CGGGCTCCTC TATACAT6GT ATACATGTAA TATATATTAT ATATATTTTT 1380 
GCAAGTGAAC AATAAAACAT TAAAAGATAA AA 

Seq ID NO: 255 Protein sequence:
Protein Accession #: NP_071732



1 11 21 

I I I

MQAPHKEHLY KLLVIGDLGV GKTSIIKRYV 
LQItWDIAGQE RPCTMTRVYY REAMGA FIVF 
SWLLAKKCD QGKDVLMNNG LRMDQFCKEH 
ECDLMESIEP DWKPHLTST KVASCSGCAK 



31 41 51

I I I

HQNFSSHYRA TIGVDFALKV LHWDPETWR 60 
DVTRPATPEA VAKWKNDLDS KLSLPKGKPV 120 
GFVGSJFETSA KENINIDEAS RCLVKHILAK 180 
S 



Seq ID NO: 256 DNA sequence
Nucleic Acid Accession #: NM_016321
Coding sequence: 25..1464

1 11 21 31 41 51 

G6AACCGCCC GCTGCCAGCC OGGCCAGGCA CCCCTGCAGC ATGGCCTGGA ACACCAACCT 60 

CCGCTGGCX3G CTGCCGCTCA CCTGCCTGCT CCTGCAGGTG ATTATGGTGA TTCTCTTOGG 120 

GGTGTTCGTG OGCTACGACT TOGAGGCCGA CGCCCACTGG TGGTCAGAGA GGACGCACAA 180 

GAACTTGAGC GACATGGAGA ACGAATTCTA CTATCGCTAC CGAAGCTTCC AGGACGTGCA 240 

CGTOATGGTC TTCGTGGGCT TCX5GCTTCCT CATGACTTTC CTGCAGCGCT AOGGCTTCAG 300 

CGCCGTGGGC TTCAACTTCC TGTTGGCAGC CTTCGGCATC CAGTGGGCGC TGCTCATGCA 360 

GGGCTGGTTC CACTTCTTAC AAGACCGCIA CATCGTCGTG GGCGTGGAGA ACCTCATCAA 420 

CGCTGACTTC TGCGTGGCCT CTGTCTGOGT GGCCTTTGQG GCAGTTCTGG GTAAAGTCAC 480 

CCCCATTCAG CTGCTCATCA T6ACTTTCTT CCAAGTGACC CTCTTCGCTG TGAATGAGTT 540 

CATTCTCCTT AACCTGCTAA AGGTGAAGGA TGCAGGAGGC TCCATGACCA TCCACACATT 600 

TOGCGCCTAC TTTGGGCTCA CAGTGACCCG GATCCTCTAC CGACX^CAACC TAGAGCAGAG 660 

CAAGGAGAGA CAGAATTCTG TGTACCAGTC GGACCTCTTT GCCATGATTG GCACCCTCTT 720 

CCTOTGGATC TACTOGCCCA GCTTCAACTC AGCCATATCC TACCATGGGG ACAOCCAGCA 780 

COSAGOOGCC ATCAACACCT ACTGCTCCTT GGCAGCCTGC GTGCTTACCT CGGTGGCAAT 840 




ATCCAGIGCC CTGCRCRftGA AGGGCAAGCT GGACATGGTG CACATC CAGA ATGCCACOT 
OGCAGGAGGG CTGGCOGTGG GTACCGCTGC TGACATGATG CTCATGCCTT AOGGtGCCCT 
CATCATOGGC TTCGTCTGOS GCATCATCTC CAOOCTOGCT TnG TASAO C TGACaXaTT 
CCTGGAGTCC OGGCTGCACA TCCAGGACAC ATGIGGCATT AACAATCTGC ATGGCATTCC 
TGGCATCATA GGCGGCATCG TGGGTGCTGT GACAGOCSGCC TCCGCCaCCC TTGAAGTCtA 
TGGAAAAGAA GGGC11X 3 TCC ATTCCTTTGA CTTTCAAGGT TTCAAOSGGG ACTGGACOGC 
AAGAACACAG GGAAAGTTCC AGATTTATGG TCTCrTGGTG ACCCTGGCCA TGGCCCTGAT 
•GGOTGGCATC ATTGTGGGGC TCATTTTGAG ATTACCATTC TOGGGACAAC CTTCAGATCA 
GAACTCCTTT GAGGATGCGG TCTACTGGGA GATGCCTGAA GGGAACAGCA CTGTCTACAT 
CCCTGACGAC CCCACCTTCA AGCCCTCAGG ACCCTCAGTA CCCTCACTAC CCATGGTGTC 
CCCACTACCC ATGGCrrCCT 0C3GTACCC1T GGTACCCTAG GCTCCCAGGG CAGGTCAGGA 
GCACSGCrCCA CAGACTSTCC TGGGGCCCAG AGGAGCTGGT GCTGACCTAG CTAGGGATGC 
AAGAGTGAGC AAGCAGCACC CCCACCTGCT GGCTTGGCCT CAAGGTGCCT CCACCCCTGC 
CCTCCCCTTC ATCCCAGGGG GTCTGMCTGA GAATGGAGAA GGAGAAGCTA CAAAGTGGGC 
ATCCAAGCCG GGTTCTGGCT GCAGAAGTTC TGCCTCTGCC TGGGGTCTTG GCCACATTGG 
AGAAAAACAG GCTCAAAGTG GGGCTQGGAC CTCGTGGGTG A ACCTGAOCT CTCCCAGGAG 
ACAACTTAGC TGCCAGTCAC CACCTATGAG GCTCTTCTAC COCgTGOCTG CACCTOGGCC 
AGCATCTCCT ATCCTCCCTG GGTCOCCCAG ACCTCTCTGT GTTGTGTGOG TGGCAGCCTC 
CAGGAATAAA CATTCrTGTT GTCCTTTOrA AAAAAAAAAA AAAAAAAA 

Seq ID NO: 257 Protein sequence:
Protein Accession #: NP_057405






900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560
1620 
1680 
1740 

1800

1860 
1920 



MAHfilTNLRHR 
PSFQDVHVMV 

GVENLINADF 
SMTIHTFGAy 
YHGDSQERAA 
IiMPYGALIIG 
SASLEVYGKE 
WGQPSDENCF 



11 
1 

LPliTCLLLQV 
FVGFGFLMTF 
CVASVCVAPG 
FGLTVTRILY 
INTYCSLAAC 
FVOGIISTLG 
GLVHSPDFQG 
EDAVYHGMPE 



21 
I 

XMVILFGVFV 
IjQRYGFSAVG 
AVIiGKVSPIQ 
RRNIiBQSXER 
VLTSVAISSA 
FVYLTPPLBS 
FNGDWTARTQ 
GKSTVYIPBD 



31 
I 

RYDFEADAHW 
FUFliLAAFGI 
LLIMTFPQVT 
QNSWQSDLP 
LKKKGKLDMV 
RliHIQDTCXSI 
GKFQIYGLLV 
PTFKPSGPSV 



41 
I 

WSERTHKNLS 
QWALLMQGWF 
LFAVNEFILL 
AMIGTLFLUM 
HIQNATIiAGG 
NNIiHGIPGII 
TLAMALKGGl 
PSVPMVSPLP 



51 

1 

DMENBFYYRY 
HFLQDRYIW 
NLLKVKDAGG 
YWPSFNSAZS 
VAVGTAAEMM 
GGIVGAVTAA 
rVGLILRLPF 
MASSVPLVP 



60 
120 
180 
240 
300 
360 
420 



Seq ID NO: 258 DNA sequence

Nucleic Acid Accession #: NM_002358.2 

Coding sequence: 75.. 692 






GGGAAGTGCT 
TTGTGTCCXn' 
GCGCCGAAAT 
GCATATATCC 
CTACTGATCT 
TATACAAGTG 
TCCTGGAAAG 
CCAGAGAAAA 
CAGCTACGGT 
ATACAGACAA 
CXSVATTCTGA 
TGGTGGCCTA 
TAATTTTGAA 
TTGGTTAATT 
CCTTTATTTT 
CATTGTTCAA 
GATAGTAACT 
GTTTTGGTCA 
AAAGGAAGTC 
AACAATGAAA 
TTGAATCAGT 
ATATTTGTAC 
TTATAAAATC 
AAAAAAAAAA 



11 

I 

GTTGGAGCCG 
GGCCATGGOG 
CX3TGGCCGAG 
ATCTGAAACX: 
TGAGCTCATA 
TTCAGTTCAG 
ATGGCAGTTT 
GTCTCAGAAA 
GACATTTCTG 
AGATTTGGTT 
GGAAGTCOGC 
CAAAATTCCT 
ATGTGGTTTT 
TTTACATGGA 
TTTGGTACCT 
AAGGAACCAG 
GTAGATGGAA 
AGTAGTTTGA 
TAAATATTCA 
TATTGCTGTA 
TTCCAATTAT 
TGTTTAATGT 
AAGTTTTAAG 



21 
I 

CTGTGGTTGC 
CTGCAGCTCT 
TTCTTCTCAT 
TTTACTCGAG 
AAATACCTAA 
AAACTGGTTG 
GATATTGAGT 
GCTATCCAGG 
CCACTGTTGG 
GTACCTGAAA 
CTTCGTTCAT 
GTCAATGACT 
CCTGAAATCA 
GAAAACCAAA 
ATTTGACTTA 
GAGGTTrTTT 
AAACTTGTGC 
CTCAGTATAG 
6AATCTTTGT 
TAGCTCCTTT 
TTGACTTTAA 
TCTGTGATAC 
TGAAA6TGAG 



31 
I 

TGtCOGOQGA 
GCOGGGAGCA 
TOGGCATCAA 
TGCAGAAATA 
ATAATGTGGT 
TAGTTATCTC 
6TGACAAGAC 
ATGAAATCOG 
AAGTTTCTTG 
AATGGGAAGA 
TTACTACTAC 
GAGGATGACA 
GGTCATCTAT 
ATGATACTTA 
CCATGGAGTT 
TGTCAACATT 
TATAAAGCTA 
GTAGGGAGAT 
TAAGGTCCTG 
TGACCTTCAT 
TTTATGTAAC 
AGAACTCTTA 
GAAATAAAGT 



41 

I 

GTGGAAGCGC 
GGGAATCACC 
CAGCATTTTA 
CX3GACTCACC 
GGAACAACTG 
AAATATTGAA 
OXSCAAAAGAT 
TTCAGTGATC 
TTCATTTGAT 
GTCGGGACCA 
AATCCACAAA 
TGAGGAAAAT 
AGTTGATATG 
CTGAACTGTG 
AACATCATGA 
GTGATGTATA 
GATGCTTTCC 
ATTTAAGTAT 
AAAGTAACTC 
TTCATGTATA 
TTGAACCTAT 
AAAATGTTTT 
TAAGTTXGTT 



51 

I 

GTGCTTTTGT 

CTGGGGGGGA 

TATCAGOGTG 

TTGCTTGTAA 

AAAGATTGGT 

AGTGGTGAGG 

GACAGTGCAC 

AGACAGATCA 

JCTGCXGATTT 

CAGTTTATTA 

GTAAATAGCA 

AATGTAATTG 

TTTTATTTCA 

TGTAATTGTT ' 

ATTTATTGCA 

TTCCTTTGAA 

TAAATCAGAT 

AAAATACAAC 

ATAATCTATA 

GTTTTCCCTA 

GAAGCAATGG 

TTCATGTGTT 

TTAAAAAAAA 



60 

120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 






Seq ID NO: 259 Protein sequence: 
Protein Accession #: NP_002349.1

1 11 21 31 41 51 

MALQLSREQG ITLRGSABIV ABPFSPGINS ILYQRGIYPS ETPTRVQKYG LTIiVTTDLE 
LIKYIJINVVE QLKDWLYKCS VQKLVWISN lESGEVZ^W QFDIECDKTA KDDSAPREKS 
QKAIQOEIRS VIRQITATVT PLPLLEVSCS FDU.IYTDKD LWPEKWEES GPQFITWSEE 
VRLRSFTTTI HKVNSMVAYK IPVND 

Seq ID NO: 260 DNA sequence
Nucleic Acid Accession #: NM_001211
Coding sequence: 43..3195



11 



21 



31 
I 



51 



AAAGGCCTCC AGCAGGACGA GGACCTGAGC CAGGAATGCA GGATGGCGGC GGTGAAGAAG 
GAAGGGGGTG CTCTGAGTGA AGCCATGTCC CTGGAGGGAG ATGAATGGGA ACTGAGTAAA 
GAAAATGTAC AACCTTTAAG GCAAGGGCQG ATCATGTCCA CGCTTCAGGG AGCACTGGCA 
CAAGAATCTC CCTCTAACAA TACTCTTCAG CAGCAGAAAC GGGCATTTGA ATATGAAATT 



60 
120 
180 



60 
120 
180 
240 




OGATTTTACA CTGGAAATGA COCTCTGGAT GTTTGOaTA GGTATATCAG CTGGA CAGAG 300 

CAGAACTATC CTCAAGGTGG GAAAGAGACT AATATCICftA O STTATTAG A AAGAGCTGTA 360 

GAACCACTAC AAGGAGAAAA AOGATATTAT AGTGATOCTC GATTTCTCAA TCICTGGCTT 420 

AAATTACGGC GTTTATGCAA TGAGCCTTTG GATATGTACA GTTACTTGCA CAACCAAGGG 480 

ATTGGTCTTT CACTTGCTCA GTTCTATATC TCATGGGCAG AAGAATATGA AGCTAGAGAA 540 

AACTTTAGGA AACCAGATGC GATATTTCAG GAAGGGATTC AACAGAACGC TGAACC3VCTA 600 

GAAAGACTAC AGTCCCftGCA CCGACAATTC CAAGCTCGAG TGTCTGGGCA AACTCTGTTG 660 

GCACTTGAGA AAGAAGAAGA GGAGGAAGTT TTTGAGTCTT CTGIACCACA AOGAAG CACA 720 

CTAGCTGAAC TAAAGAGCAA AGGGAAAAAG ACAGCAAGAG CTCCAATCAT COGTGTAGGA 780 

GGTGCrCTCA AGGCTOCAAG CCAGAACAGA GGACTCCAAA ATCCATTTCC TCAACAGATG 840 

CAAAATAATA GTAGAATTAC TGTTTTTGAT GAAAATGCTG ATGAGGCTTC TACAGCAGAG 900 

TTGTCTAAGC CTACAGTCCA GCCATGGATA GCACCCCCCA TGCCCAGGGC CAAAGAGAAT 960 

GAGCreCAAG CRGGCCCPrG GAACACAGGC AGGTOCTTGG AACACAGGCC TCGTGGCAAT 1020 

ACAGCrrCAC TCATAGCTGT ACCCGCTGTG CTPCOCAGTT TCACTCCATA TGTGGAAGAG 1080 

ACTGCACAAC AGCCAGTTAT GACACCATGT AAAATTGAAC CTAGTATAAA CCACATCCTA 1140 

AGCACCAGAA ACCCTGGAAA GGAAGAAGGA GATCCTCTAC AAAGGGTTCA GAGCCATCAG 1200 

CAAGCGTCro AGGACAA6AA AGAGAAGATG ATGTATTGTA W3GAGAAGAT TTATGCAGGA 1260 

GTAGGGGAAT TCTGCrTTGA AGAAATTCGG GCTGAAGTTT TC0C5GAAGAA ATTAAAAGAG 1320 

CAAAGGGAAG CCGAGCTATT GACCAGTGCA GAGAAGAGAG CAGAAATGCA GAAACA6ATT 1380 

GAAGAGATGG AGAAGAAGCT AAAAGAAATC CAAACTACTC AGCAAGAAAG AACAGGTGAT 1440 

CAGCAAGAAG AGACGATGOC TACAAAGGAG ACAACTAAAC TGCAAATTGC TTCCGAfiTCT 1500

CAGAAAATAC CAGGAATGAC TCTATCCAGT TCTGTTTGTC AAGTAAACTG TTGTGCCAGA 1560 

GAAACTTCAC TTGOGGAGAA CATPrGGCAG GAACAACCTC ATTCTAAAGG TCCCAGTGTA 1620 

CCTTTCTCCA TTrTTGATGA G ' m 'C'riC'i'i' TCAGAAAAGA AGAATAAAAG TCCTCCTGCA 1680 

GATCCCCCAC GAGTTTTAGC TCAACGAAGA CCCCTTGCAG TTCTCAAAAC CTCAGAAAGC 1740 

ATCACCTCAA ATCAAGATGT 6TCTCCAGAT GTTTGTGATG AATTTACAGG AATTGAACCC 1800 

TTGAGCGAGG ATGCCATTAT CACAGGCTTC A6AAATGTAA CAA TTTGT CC TAACCCAGAA 1860 

GACACTTCTG ACTTTGCCAG AGCAGCTCGT TTTGTATCCA CTCCTTTTCA TGAGATAATG 1920 

TCCTTGAAGG ATCTCCXTTTC TGATCCTGAG AGACTGTTAC CGGAAGAAGA TCTAGATGTA 1980 

AAGACCTCTG AGGACCAGCA GACAGCTTGT GGCACTATCT ACAGTCAGAC TCTCAGCATC 2040 

AAGAAGCTGA GCCCAATTAT TGAAGACAGT CGTGAAGCCA CACACTCCTC TGGCTTCTCT 2100 

GGTTCTTCTG CCTCGGrTGC AAGCACCTCC TCCATCAAAT GTCTTCAAAT TCCTGAGAAA 2160 

CTAGAACTTA CTAATGAGAC TTCAGAAAAC CCTACTCAGT CA0CATGGT6 TTCACAGTAT 2220 

CGCAGACAGC TACTGAAGTC CCTACCAGRG TTAAGTGCCT CTGCAGAGTT GTGTATAGAA 2280 

GACAGACCAA TGCCTAAGTT GGAAATTGAG AAGGAAATTG AATTAGGTAA TCAGGATTAC 2340 

TGCATTAAAC GAGAATACCT AATATGTGAA GATTACAAGT TATTCTGGGT GGOGCCAAGA 2400 

AACTCTGCAG AATTAACAGT AATAAAGGTA TCTTCTCAAC CTGTCCCATG GGACTTTTAT 2460 

ATCAACCTCA AGTTAAAGGA ACGTTTAAAT GAAGATTTTG ATCATTTTTG CAGCTGTTAT 2520 

CAATATCAAG ATGGCTGTAT TGTTTGGCAC CAATATATAA ACTGCTTCAC CCTTCAGGAT 2580 

CTTCTCCAAC ACAGT6AATA TATTACCCAT GAAATAACAG TGTTGATTAT TTATAACCTT 2640 

TTGACAATAG TGGAGATGCT ACACAAAGCA GAAATAGTCC ATGGTGACTT GAGTCCAAGG 2700 

TGTCTGATTC TCAGAAACAG AATCCACGAT CCCTATGATT GTAACAAGAA C AATCAAG CT 2760 

TTGAAGATAG TGGACTTTTC CTACAGTGTT GACXTTTAGGG TGCAGCTGGA TGTTTTTACC 2820 

CTCAGCGQCT TTCGGACTGT ACAGATCCTG GAAGGACAAA AGATCCTGGC TAACTGTTCT 2880 

TCTOCCTACC ACGTAGACCT GTTTGGTATA GCAGATTTAG CACATTTACT ATTGTTCAAG 2940 

GAACACCTAC AGGTCTTCTG GGATGGGTCC TTCTGGAAAC TTAGCCAAAA TATTTCTGAG 3000 

CTAAAAGATG GTGAATTGTG GAATAAATTC TTTGTGOQGA TTCTG AATGC CAATGATGAG 3060 

GCCACAGTGT CTGTTCTTGG GGAGCTTGCA GCAGAAATGA ATGGGGTTTT TGACACTACA 3120 

TTCCAAAGTC ACCTGAACAA AGCCTTATGG AAGGTAGGGA AGTTAACTAG TCCTGGGGCT 3180 

TTGCTCTTTC AGTGAGCTAG GCAATCAAGT CTCACAGATT GCTGCCTCAG AGCAATGGTT 3240 

GTATTGTGGA ACACTGAAAC TGTATGTGCT GTAATTTAAT TTACGACACA TT TAGATG CA 3300 

CTACCATTGC TGTTCTACTT TTTGGTACAG GTATATTTTO AOGTCACTGA TATTTTTTAT 3360

ACAGTCATAT ACTTACTCAT GGCCTTGTCT AACTTTTGTG AAGAACTATT TTATTCTAAA 3420 

CAGACrCATT ACAAATGGTT ACCTTGTTAT TTAACCCATT TGTCTCTACT TTTCCCTGTA 3480 

CTTTTCCCAT TTGTAATTTG TAAAATGTTC TCTTATGATC ACCATGTATT TTGTAAATAA 3540 
TAAAATAGTA TCTGTTAAAA AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 261 Protein sequence:
Protein Accession #: NP_001202 

1 11 21 31 41 51 

I I I I I I

MAAVKKEGGA LSEAMSLBGD EWELSKENVQ PLRQGRIMST LQGALAQBSA CNNTLQQQKR 60 

AFEYEIRFYT GNDPLDVWDR YISWTBQNYP QGGKESNMST LLERAVEALQ GEKRYYSDPR 120 

PLNLWUKLGR LCMEPLOKYS YLHNQGIGVS LAQPYISHAE EYEARENPRK ADAIFQEGIQ 180 

QKAEPLERLQ SQHRQFQARV SRQTLIALEK EEEEEVFESS VPQRSTLAEL KSKGKKTARA 240 

PIIRVGGAUC APSQNRGLQN PFPQQMQNNS RITVFDENAD EASTAELSKP TVQPWIAPPM 300 

PRAKENELQA GPWNTGRSLE HRPRC2JTASIj IAVPAVLPSP TPYVEETAQQ PVMTPCKIEP 360 

SINHILSTRK PGKEEGDPLQ RVQSHQQASE EKKEKMMYCK EKIYAGVGEP SFEEIRAEVF 420 

RKKIiKEQREA ELLTSAEKRA EMQKQIEEME KKLKEIQTTQ QERTGDQQEE TMPTKETTKL 480 

QIASESQKIP GMTLSSSVCQ VNCCARETSL AENIWQEQPH SKGPSVPPSI FDEFLLSEKK 540 

NKSPPADPPR VLAQRRPIAV LKTSESITSN EDVSPDVCDE PTGIBPIiSED AUTGFRNVT 600 

ICPNPEDTCD FAHAARPVST PFHBIMSLKD LPSDPERUiP EBDLDVKTSB DQQTACGTIY 660 

SQTLSIKKLS PIIEDSREAT HSSGFSGSSA SVASTSSIKC LQIPEXLELT NETSEKPTQS 720 

PWCSQYRRQL LKSLPELSAS AELCIEDRPM PKLEIEKEIE LGNEDYCIKR EYLICEDYKL 780 

FWVAPRNSAE LTVIKVSSQP VPWDFYINLK LKERLNEDFD HFCSCYQYQD GCIVWHQYIN 840 

CFTLQDLLQH SEYITHEITV LIIYNIiLTIV EMLHKAEIVH GDLSPRCLIL RNRIHDPYDC 900 

MKNNQALKIV DPSYSVDIiRV QLOVPTLSGF RTVQILBGQK IIiANCSSPYQ VDLFGIADLA 960 

HLLLFKEHLQ VFHDGSFWKL SQNISELKDG EI.WNKPFVR1 LNAKDBATVS VLGELAAENN 1020 
GVFDTTPQSH I2JKALWKVGK LTSPGALLFQ 

Seq ID NO: 262 DNA sequence
Nucleic Acid Accession #: NM_003784
Coding sequence: 365..1507



1 11 21 31 41 51



I I I I I I

CTCTACTTAT CAATAAGCAG CIGCCTGTGC AGACTGCAGG CTGCACCTTT G GACftGCCTT 60 

TAAAACTCAA TTCTCAGAAT TTTAGAACAA ATTTTTGTCT AGAAATGCTG ACmGGTTC 120 

ATTAGGTAGT GGTAAAACAG GCTCCXTTTCG AAGCTCTCCT TCATCACCTT CCTAAGTGCA 180 

TGTAGAOGGA AGCTCTCCTT CATCACCTTC CTAAGTGCAT GGGGGAAAAT ACCTAGGGCT 240 

CAACAGTCTT GAGAACTGTG GAAACATTTT CTTTGTGAGT GAGAACAGAT CACCTAGACA 300 

AAGGAAACCA GATTCCCATC ACTGCTTCTG GGTATCAGAT GCTAGCGCTC CACTCCATTT 360 

TCCAATGGCC TCCCTTGCTG C»GCAAATGC AGAGTTTTGC TTCAACCIGT TCW5AGAGAT 420 

GGATCACAAT CAAGGAAATG GAAATGTGTT CTTTTCCTCT CTGAGCCTCT T OGCT GCCCT 480 

QGCCCTGGTC CGCTTCGGOG CTCAAGATGA CTCCCTCTCT CAGATTGATA AGTTGCTTCA 540 

TCTTAACACT GCCTCAGGAT ATGGAAACTC TTCTAATAGT CAGTCAGGGC TCCAGTCTCA 600 

ACTGAAAAGA O Trn T TCTO ATATAAATGC ATCCO^CSU^G GATTATGATC T CAGCA TTGT 660 

GAATGGGCTT TTTGCTGAAA AAGTGTATGG CTTTCATAAG GACTACATTG AGTGTGCCGA 720 

AAAATTATAC GATGCCAAAG TGGACCGAGT TGACTTTAOG AATCATTTAG AAGACACTAG 780 

ACGTAATATT AATAAGTGGG TKa^AAATCA AACACATCGC AAAATCAAGA A0GT6ATTGG 840 

TGAAGGTGGC ATAAGCTCAT CTGCTGTAAT GGTGCTGGTG AATGCTGTGT ACTTCAAAGG 900 

CAAGTCGCAA TCAGCCTTCA CCAAGAGCGA AACCATAAAT TGCCATTTCA AATCTCCCAA 960 

GTGCTCXGGG AAGGCAffTOG CCATGATGCA TCAGGAACGG AAGTTCAATT TGTCTGTTAT 1020 

TGAGGACCCA TCAATGAAGA TTCTTGAGCT CAGATACAAT GGTGGCATAA ACATGTACCT 1080 

TCTGCTGCCT GAGAATGACC TCTCTGAAAT T6AAAACAAA CIGAC CTTTC AGAATCTAAT 1140 

GGAATGGACC AATCCAAGGC GAATGACCTC TAAGTATGTT aWSGTATTTT TTOCTCAGTT 1200 

CAAGATAGAG AAGAATTATG AAAIGAAACA ATATTTGAGA GCCCTACGGC TGAAAGATAT 1260 

CTTTCATGAA TCCAAAGCAG ATCTCTCTGG GATTGCTTCG GGGGGTCGTC TGTATATATC 1320 

AAGGATGATG CACAAATCTT ACATAGAGGT CACTGAGGAG GGCACCGAGG CTACTGCTGC 1380 

CACAGGAAGT AATATTGTAG AAAAGCAACT CCCTCACTCC ACGCTGTrrA GAGCTGACCA 1440 

CCCATTCCTA TTTGTTATCA GGAAGGATGA CATCATCTTA TTCAGTCGCA AAGTTTCTTC 1500

CCCTTGAAAA TCCAATTCQT TTCTGTTATA GCACyTCCCCA CAACATCAAA GRACCACXAC 1560

AAGTCAATAG ATYTGRGTTT AATTGGAAAA ATGTGGTGTT TCCTTTGAGT TTATTTCTTC 1620 

CTAACATTGG TCAGCAGATG ACACTGGTGA CTTGACCCTT CCTAGACACC TGGTTGATTG 1680 

TCCTGATCCC TGCTCTTAGC ATTCTACCAC CATGTGTCTC ACCCATTTCT AATTTCATTG 1740 

TCTTTCTTCC CACGCTCATT TCTATCATTC TCCCCCATGA CCCGTCTGGA AATTATGGAG 1800 

RGTGCTCAAC TGGTAAGGA6 AAOGTAGAAG TAGCCCTAGG GATCCTTTTT GAAACTCTAC 1860 

AGTTATCGCA GATATTCTAG CTTCATTGTA AGCAATCTAG GAAATAAGCC CTGCTGCTTT 1920 

CTAGAAATAA GTGTGAAGGA TAAATTTTCT TTCTTGACCT ATGAAGATTT TAGAGTTTAC 1980 

CTTCATATGT TTGATTTTAA ATCAGTGTAT AATCTAGATG GTAAAAAATG TGAAATTGGG 2040 

ATTAGGGACC TACX»AAATA TTTCATTAAT GCTTTCAATT GACAAATTTT GGCCTTTCTT 2100 

TGATAAGACA ATATGTACAT GTTTTTTCAA ATATTAAAGA TCTTTTAACT GTTGGCAGTT 2160 

GTTATCTACA GAATCATATT TCATATGCTG TGTAGTTTAT AAGTTTTTCC TCTATTTATC 2220 
AGAATAAAGA AATACAACAT A0CT6TAAA 



Seq ID NO: 263 Protein sequence:
Protein Accession #: NP_003775



1 11 21 31 41 51 

MASU\AANAE PCFNLFREMD DNQGHGMVPF SSLSLFAALA LVRLGAQIMJS LSQIDKLIflV 60 

NTASGYGNSS NSQSGLQSQL KRVFSDINAS HKDTOLSIVir GLPAEKVYGP HKDYIECAEK 120 

LYDAKVERVD FTNHLEDTRR NINKWVENET H6KIKNVIGE GGISSSAVMV LVNAVYFKGK 180 

WQSAPTKSET INCHFKSPKC SGKAVA^!MHQ ERKFNLSVIE DPSMKILELR YNGGINMYVL 240 

LPENDLSEIE NKLTTONLMB WTNPRfiKTSK YVEVFFPQPK lEKNYEMKQY LRALGLKDIP 300 

DESKADLSGI ASGGRLYISR MMHKSVIEVT EEGTEATAAT GSNIVBKQLP QSTLPHADHP 360 
FLFVIRKDDI ILFSGKVSCP 



Seq ID NO: 264 DNA sequence 
Nucleic Acid Accession #: AB052906
Coding sequence: 74-814

1 11 21 31 41 51

AAAACCTTGA GGTGATTCAT CTTCCAGGCT CTCXTTTCCAT CAA6TCTCTC CTCCCTAGCG 60

CTCTCGGTCC TTAATCGCAG CAGCCGCCGC TACCAAGATC CTTCTGTGCC TCCCGCTTCT 120 

GCTCCreCTG TCCGGCTGGT CCCGGGCTGG GCGAGCCGAC CCTCACTCTC TTTGCTATGA 180 

CATCACCGTC ATCCCTAAGT TCAGACCTGG ACCACGGTGG TGTGCGGTTC AAGGCCAGGT 240 

GGATGAAAAG ACTTTrCTTC ACTATGACTG TG6CAACAAG ACAGTCACAC CTGTCAGTCC 300 

CCTGGGGAAG AAACTAAATG TCACAACGGC CTGGAAAGCA CAGAACCCAG TACTGAGAGA 360 

GGTGGTGGAC ATACTTACAG AGCAACTGCG TGACATTCAG CTGGAGAATT ACACACCCAA 420 

GGAACCCCTC ACCCTGCAGG CCAGGATX3TC TTGTGAGCAG AAAGCTGAAG GACACAGCAG 480 

TGGATCTTGG CAGTTCAGTT TCX5ATGGGCA GATCTTCCTC CTCTTTGACT CAGAGAAGAG 540 

AATGTGGACA AOGGTTCATC CTGGAGCCAG AAAGATGAAA GAAAAGTGGG AGAATGACAA 600 

GGTTGTGGCC ATGTCCTTCC ATTACTTCTC AATGGGAGAC TGTATAGGAT GGCTTGAGGA 660 

CTTCTTGATG GGCATGGACA GCACCCTGGA GCCAAGTGCA GGA GCAC CAC TCGCCATGTC 720 

CTCAGGCACA ACCCAACTCA GGGCCACAGC CAOCACCCTC ATCXTTTTGCT GCCTCCTCAT 780 

CATCCTCCCC TGCTTCATCC TCCCTGGCAT CTGAGGAGAG TCCTTTAGAG TGACAGGTTA 840 

AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900 

CCAGCTGCCC ACGACCTACG GTGTATGTCC AGTGGCCTCC AGCAG ATCA T GATGACATCA 960 

TGGACCCAAT AGCTCATTCA CTGCCTTGAT TCCTTTTGCC AACAATTTTA CCAGCAGTTA 1020 

TACCTAACAT ATIATGCAAT TTTCTCTTGG TGCTACCTGA TGGAATTCCT GCACTTAAAG 1080 

TTCTCGCTGA CTAAACAAGA TATATCATTT TCTTTCTTCT CTTTTTCTTT GGAAAATCAA 1140 

GTACTTCrrr GAATGATGAT CTCTTTCTTG CAAATGATAT TGTCAGTAAA ATAATCAC3GT 1200 

TAGACTTCAG ACCTCTGGGG ATTCTTTCXX; TGTCCTGAAA GAQAATTTTT AAATTATTTA 1260 

ATAAGAAAAA ATTTATATTA ATGATTCTTT CCTTTAGTAA TTTATTGTTC TCTACTGATA 1320 
TTTAAATAAA GAGTTCTATT TCCCAAAAAA AAAAAAAAAA A 



Seq ID NO: 265 Protein sequence:
Protein Accession #: BAB61048.1




1 11 21 31 41 51 

iLaaaatkii* lclplli*lls GWSRAGRADP HSLCYDITVI PKFRPGPRWC AVQGQVDSKT 60 
FLHYDOGSKT VTPVSPLGKK LSVTTAWKAQ NPVtaEWDI LTEQJjBDIQL ENYTPXEPLT 120 
LQARMSCEQK AEGRSSGSKQ FSFDGQIFXiI« FDSCKRKWTT VHFGARKNKB RHBIDXWMI 180
SPHYFSKGDC IGWLEDFM5G KDSTI.BPSAG APLAMSSGTT QLRATATTLl LCX2iIIIiPC 240 
FILPGI 

Seq ID NO: 266 DNA sequence
Nucleic Acid Accession #: XM_084853.1
Coding sequence: 127-444 

1 11 21 31 41 51 

ATTGATGATA TATTTAACXSA AATCAAATTT GGTGAATATG TGQACACTGG AAAGCTAATC 60

GACAAGATCA ACTTACCAGA TTTCCTAAAA GTGTACCTTA ACCACAAGCC ACCTTTTCGT 120 

AACACCAtGA GTGGCATCCA CAAGAGCTTT GAGGTGCTCG GTTATACCAA CTCCAAAGGG 180 

AAAAAQGCCA TTCGAAGAGA GGACTTCCTG AGACTGCTCG TTACTAAAGG TGAGCATATG 240 

ACX3GAGGAGG AGATCTTGGA TTGCTTTGCT TCACTGTTTG GCCTGAATCC CGAGGGATGG 300 
AAATCCGAGC CTGCAACCTG CTCCGTCAAA GGTTCAGAAA TTTGCCTTGA AGAAGAACTT 360

CCAGACGAAA TCACTGCAGA AATATTCGCG ACTGAAATTC TTGGCTTAAC CATT TCAGA A 420 

GATTCCGGCC AGGATGGTCA GTGAAGTTAC CACGAATGTT TAAAGCACAA AGG ACTTTGG 480 

GTGTGTGT6C AaXSCACATGT GTGTGTTTTC CATGAGGCAC TGCTTmAT GCATTTCCCT 540

COCCCCTCTC ATCTTCAGAA CATTTACACA TTAAAQCAAG TTTCTGGTGA GCAATG 


Seq ID NO: 267 Protein sequence:
Protein Accession #: XP_084853.1

1 11 21 31 41 51

MSGIHKSFEV LGVTNSKGKK AIRREDFLRL LVTK6EHMTE EEMLDCFASL PGL»PBGWKS 60 
EPATCSVKGS EICLEEELPD EITAEIFATE ILGLTISEDS GQDGQ 

Seq ID NO: 268 DNA sequence

Nucleic Acid Accession #: NM_001898
Coding sequence: 57-482 

1 11 21 31 41 51 

I I I I I I

GGCTCTCACC CTCCTCTCCT GCAGCTCCAG CTTTGTGCTC TGCCTCTGAG GAGACCATGG 60 

CCCAGTATCT GAGTACCCTG CTGCTCCTGC TGGCCACCCT AGCTGTGGCC CTGGCCTGGA 120 

GCCCCAAGGA GGAGGATAGG ATAATCCCGG GTGGCATCTA TAACGCAGAC CTCAATGATG 180 

AGTGGGTACA GCGTGCCCTT CACTTCGCCA TCAGCGAGTA TAACAAGGCC ACCAAAGATG 240 

ACTACTACAG ACGTCCGCTG CGGQTACTAA GAGCCAGGCA ACAGACCGTT GGGGGGGTGA 300

ATTACTTCTT CGACGTAGAG GTGGGCCGCA CCATATGTAC CAA GTCCCA G C CCAACT TGG 360 

ACACCTGTGC CTTCCATGAA CAGCCAGAAC TGCAGAAGAA ACAGTTGTGC TCfTTCGRGA 420 

TCTACGAAGT TCCCTGGGAG AACAGAAGGT CCCTGGTGAA ATCCAGGTGT CAAGAATCCT 480 

AGGGATCTGT GCCACGCCAT TOGCACCAGC CACCACCCAC TCCCACCCCC TGTAGTGCTC 540 

GCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCATGTGCC TGCGCCAAGA 600

GACAOACAGA GAAGGCTGCA GGAGTCCTTT GTTGCTCftGC AGGGCGCTCT GCCCTCCCTC 660 

CTTCCTTCTT GCTTCTAATA GCCCTGGTAC ATGGTACRCA CCCCCCCACC TCCTGCAATT 720 

AAACAGTAGC ATCGCC 

Seq ID NO: 269 Protein sequence:
Protein Accession #: NP_001889.1

1 11 21 31 41 51 

MAQYLSTLLL llATLAVALA HSPKEEDRII PGGIYNADLN DEWVQRALHF AISEYNKATK 60
DDYYRRPLRV LRARQQTVGG VNYPPDVEVG RTICTKSQPN LDTCAFHEQP ELQKKQLCSF 120 
EIYEVPWENR RSLVKSRCQE S 

Seq ID NO: 270 DNA sequence 
Nucleic Acid Accession #: XM_093210
Coding sequence: 13-1854 

1 11 21 31 41 51 

ATGGCAAGCG CCGGAATCTC CTCAGCTGOC GTTTCACAAA AGAGGTACCA GGTCCGCACC 60

AAACGAGCAC ACAAGCACCA CCAGGAGCTG CAGAAGAAGG AGGCGGCAGC GATGGACCAG 120 

GGCAGAGGGA ATGGGGAGGG GGCATCCTAC CCCATATCTG AGGTGCGACT GCGGGACGTA 180 

GAGGGGACTG GGCCTTTCCC GTTGGCGCGT GGCCTCAATC AGGACTTCTT GC0CA0GT6C 240 

GCCTTCAAAA CGGTAAGAGC TGCAACTGAA CGTGTGAGAC ATGGTGCAGA TAOGCTGAGA 300 

75 GGCGGCGGGA GAGATGCCCA TGAACTCAAG TACCCGGACA CGCCCTCCAC TTCTACCACC 360 

ACGAGTAACA CCGCCCCCAC GGGACCGCTC TCGAGGTCCC CCAAGCCAAG GACGCAAGGA 420 

GGAACGCCCC G6G60G0GGC CAGCAGG6GC GGGCACCG6C CCAATGGCCA CGGAACTCAG 480 

CACTGGCAGT CGGCOCTCCT CACACOGCAG GOGTGCAGTG TGGCOSAGGG AGCCTCCOGG 540

GCCGAGGACC CAGCTAGGCX: GTCACCCOOG TTGCTOCCAC G6GAAGGGGC AOCAGGCAAA 600 

CTGCCCAAGG CCCCGAGCCC AGGCTOCCTG GCGGAGGCCT CCGCTGGTCC CGCCCAGATC 660

ATGGCCGCCA CCAGGCTCCC GAGCCATGGC TTCCTGTCCG GGAAOGGCCC GGOGTCCTGG 720 

CTGTCCAGCT AO 

Seq ID NO: 271 Protein sequence: 
Protein Accession #: XP_093210

1 11 21 31 41 51



I I I I I I

MLSHG3QKRK RARKKKDFLP TCAFKTVRAA TBRVRBSADR LRGGG aPAHS LKYHJTPSTS 
TTTSNTAPTG PLSRSPKPRT QGGTPRERPA AAGTRAKGEG TQHBQSAUiT PQACSVADGA 
SRAEDPARPS PRLLPREGAP GSLPKAPSPG SLAEASAGLL AHVRLONAOA Q3VSZSQALP 
FH5SVGRSEE RPGAfiQQRRA PAPMATBLST GSHPSSHRRR AVWPTEPPGP RTQLBPSPBL 
LPREGAPGKL PKAPSPGSLA EASAGFAQIM AATRLPSRGP I.SG1IGPASHL SS 

Seq ID NO: 272 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..732 






1 
1 

GGATACTGTG 
TGAAAAAGCT 
TAATGTGGAG 
ATACCCACTT 
ATGATTTTGT 
TAAATTATTT 
TTAGTATCAC 
AAAATTGCAG 
TTAAGCC 



11 
1 

TCACTCAAAG 
TTTTTTCCCA 
GAAATTATTC 
GAAGCCTCTG 

TTATTTATCT 
AATTTATGGG 
AAGTCATAGG 



21 
I 

TAATGGGA06 
CTTTTAACTT 
TTTCTCATTG 
TAGAAATGTC 
CAGTGAGAAA 
TTCATATAGT 
AGAGGGTTTT 
ACTGTCATGT 



31 
I 

GAGAGAGAAC 
GCTTTAGCGT 
GAGATTACAG 
TCGTCCTCCG 
TTACATCCAT 
TCTTACAATT 
TTGTATTTTT 
ATTG^GCTC 



41 
I 

AGGGAGGGTA 
TAAGAGTACr 
AATATATCTA 
GTTGTATTTC 
AGCAAAGACA 
TCTAAAAAAT 
AAGCATATGT 
TGA6AA0CAA 



51 
I 

GGGATGCTTT 
TACCAGCTAA 
TTCATCTTGA 
TAAAACCTAC 
AAAGTCTTTT 
TAACACTCAT 
GGCTTATATA 
TGCCTGAAAC 



60 
120 
180 
240 



60 
120 
180 
240 
300 
360 
420 
480 






Seq ID NO: 273 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 

I I I I
MGGRENREGR OAFBKAPFPT FKLL 

Seq ID NO: 274 DNA sequence 

Nucleic Acid Accession #: NM_003976.2

Coding sequence: 299-961 



51 
I 



CTCTGAGCTT 
CATGGAGTTG 
CTACTTCTGC 
GGGTGGCAGG 
CAGGAOGGTG 
GGAACTTGGA 
TGCCCTGTGG 
GGGCTCCGCG 
OGCOGGCCAC 
GCCGCCGCAG 
CGGGGGCCGC 
GGGCTGCCX3C 
CGACGAGCTG 
CGACCTCAGC 
GOCCGTCAGC 
CAACAGCACC 
AGGGCTCGCT 
CCrCCCGCAG 
AGGCCCCTAC 
CAGOCCCAGA 
GGAGCCCTTC 
CCCTCCTCTG 
ACA6CATTTG 
CCTGTACTCA 



11 
1 

CTCT6AGCCT 
TGAAAGAATA 

tgggttgagt 
ccggtccccc 

GGGGAACAGC 
CTTGGAGGOC 
CCCACCCTGG 
CCCCGCA6CC 
CT6C0GGGG6 
CCTTCTCGGC 
GGGGCGOGGG 
CTGCGCTCGC 
GTGOGTTTCC 
CTGGCCAGCC 
CAGCCCTGCT 
TGGAGAACCG 
CCAGGGCTTT 
AGTCCCACTA 
CGGTGGGTGA 
GCCCTCACCC 
GGACCCACTT 
ATGAACACTA 
AAGGACACAT 
CTCATGGGAG 



21 
1 

T6TTTGCTCA 
GCTGCAAA6C 
CTAGCTGTGT 
ACAAAAGATA 
TCAACAATGG 
TCTCCACGCT 
CCGCTCTGGC 
CTGCCCCCCG 
GACGCACGGC 
CGGC6CCCCC 
CTGGGGGCCC 
AGCTGGTGCC 
GCTTCTGCAG 
TACTGGGCGC 
GCCGACCCAC 
TGGACC6CCT 
GCAGACTGGA 
GCCAGCGGCC 
TGGATATCAT 
TGCGGATCCC 
CTCACAGACT 
CAGTGGCTGA 
ATT6CAGTTG 
CTGGCCCC 



31 

I 

TCTGGAAAAA 
ACCTAACACA 
AGGCCCCTTG 
ACTCATCTCT 
CTGATGGGCG 
GTCCCACTGC 
TCTGCTGAGC 
CGAAGGCCCC 
CCGCTGGTGC 
GCCGCCTGCA 
OGGCAGCCGC 
GGTGCGCGCG 
CGGCTCCTGC 
CGGGGCCCTG 
GGGCTACGAA 
CTCOGCCACC 
CCCTTACCGG 
TCAGCCAGGG 
CCCCX3AACAG 
AGCCTAAAAG 
CTGGCACTGG 
GGCATCAGCC 
CTTGGTTGAA 



41 
I 

GGGGATTAAA 
TAGTAAGGTT 
TTCCTCACCT 
TAATTTGCAA 
CTCCTGGTGT 
CCCTGGCCTA 
AGCGTCGCAG 
COGCCTGTCC 
AGTGGAAGAG 
CCCCCATCTG 
GCTCGGGCAS 
CTCGGCCTGG 
CGCCGCGOGC 
CGACCGCCCC 
GCGGTCTCCT 
GCCTGCGGCT 
TGGCTCTTCC 
ACGAAGGCCT 
GTGAAGGGAC 
ACACCAGAGA 
CCAGGCCTCG 
CCCGCCCAGG 
AGTGCCTGTG 



51 
1 

(XATTTACCT 
CCCAGTGCAG 
GGAGAAACTG 
GCTGCCTCAA 
TGATAGAGAT 
G6CGGCAGCC 
AGGCCTCCCT 
TGGCGTCCCC 
CCOGGOGGCC 
CTCTTCCCCG 
GGGGGGCGCG 
GCCACCGCTC 
GCTCTCCACA 
CGGGCTCCCG 
TCATGGACCT 
GCCTGGGCTG 
TCCCTGGGAC 
CAAAGCTGAG 
AACTGACTAG 
CCTCAGCTAT 
AACCTGGGAC 
CCCTGTAGGG 
CTGGAACTGG 



Seq ID NO: 275 Protein sequence: 
Protein Accession #: NP_003967.1

1 11 21 31 41 51

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP RBGPPPVLAS 
PA6HLPGGRT ARHCSGRARR PPPQPSRPAP PPPAPPSALP RG6RAARAGG PGSRARAAGA 
R6CRLRSQLV FVRALGLGHR SDELVRPRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 
RFVSQPCCRP TRYEAVSPMD VNSIWRTVDR LSATAOGCtXa 

Seq ID NO: 276 DNA sequence 

Nucleic Acid Accession #: NM_057091.1

Coding sequence: 783-1445 



1 11 21 31 41 51
| | | | | |
ACTGGCCGCT GAGAGAAGAA TCGOGTGGAG CAGAGAGCAG CTGCTGCAGG GCAGAOIGCC 60
GGACCCCCAA ATCTGCACGT ACCAGCAGTC AGCCGCCCCA OGCAGGGACC GGCTTACCCC 120
TCGCTCCCOG CCCTCACTCA CTTTCTCCCG CCCTCGGCCC GGCCTCCCAS CTCTCTACTT 180
CGCGTGTCTA CAAACTCAAC TCCCGGTTTC CGTGCCTCTC CAC0GCTCGA GTTCTCTACT 240
CTCCATATCC QAGGGGCCCC TCCCAGCATC TACCCCCCTC CCAACCTCGG GGGACCTAGC 300
CAAGCTAGGG GGGACTGGAT COGACGGGTG GAGCAGCCAG GTGAGCCCCG AAAGGTGGGG 360
OGGGGCAGGG GOGCTCCCAG CCCCACCCCG GGATCTGGTG ACGCTGGGGC TGGAATTTGA 420
CACCQGAOGG CTGCGG0GGC GGGCAGGAGG CTGCTGAGGG ATGGAGTTGO GOOOGGOCCC 480
CAGACAAGGC C0GGGGGCTC CGCCAGCRGC A0GTCCCTGG GGCCCCAGCC CTCGCTGOCA 540
CCCGGGCXriG GAGCCCCACA CCOGAGGGTG CAGACTGGCT GCCAAGGCCA CACTTTTGGC 600
TAAAAGAOGC ACTGCCAGGT GTACAGTCCT GOGCATGGGC TCTTTGAGCT TOGGGGGAGA 660
GCX3CAfiCACT GGT0CC0GGA AAOGTGCCTA GAAGAACAAG GTGCStfSGACC OOGTGCTGCC 720
TCAACAOGAG GGTGGGGGAA CAGCTCAACA ATGGCTGATG GGCGCTCCTG GTGTTGATAG 780
AGATGGAACT TGGACTTGGA GGCCTCTCCA CGCTGTCCCA CTGCCCCTGG CCTAGGCGGC 840
AGCCTGCCCT GTGGCCCACC CTGGCCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT 900
CCCTGGGCTC CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT 960
CCCCCGCCGG CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC 1020
GGCCGCCGCC GCAGCCTTCT CGGCCCGCGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC 1080
CCCGCGGGGG CCGCGCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG 1140
CGCGGGGCTG CCGCCTGCGC TCGCAGCTGG TGCCGGTGCG CGCGCTCGGC CTGGGCCACC 1200
GCTCCGACGA GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCCGC GCGCGCTCTC 1260
CACACGACCT CAGCCTGGCC AGCCTACTGG GCGCCGGGGC CCTGCGACCG CCCCCGGGCT 1320
CCCGGCCCGT CAGCCAGCCC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG 1380
ACGTCAACAG CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG 1440
GCTGAGGGCT CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG 1500
GGACCCTCCC GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAG GCCTCAAAGC 1560
TGAGAGGCCC CTACCGGTGG GTGATGGATA TCATCCCCGA ACAGGTGAAG GGACAACTGA 1620
CTAGCAGCCC CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG 1680
CTATGGAGCC CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTG 1740
GGACCCCTCC TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT 1800
AGGGACAGCA TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA 1860
CTGGCCTGTA CTCACTCATG GGAGCTGGCC CC



Seq ID NO: 277 Protein sequence: 
Protein Accession #: NP_003967.1

1 11 21 31 41 51

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS
PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA
RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG

Seq ID NO: 278 DNA sequence

Nucleic Acid Accession #: NM_057160.1

Coding sequence: 1-714 



1 11 21 31 41 51
| | | | | |
ATGCCCGGCC TGATCTCAGC CCGAGGACAG CCCCTCCTTG AGGTCCTTCC TCCCCAAGCC 60
CACCTGGGTG CCCTCTTTCT CCCTGAGGCT CCACTTGGTC TCTCCGCGCA GCCTGCCCTG 120
TGGCCCACCC TGGCCGCTCT GGCTCTGCTG AGCAGCGTCG CAGAGGCCTC CCTGGGCTCC 180
GCGCCCCGCA GCCCTGCCCC CCGCGAAGGC CCCCCGCCTG TCCTGGCGTC CCCCGCCGGC 240
CACCTGCCGG GGGGACGCAC GGCCCGCTGG TGCAGTGGAA GAGCCCGGCG GCCGCCGCCG 300
CAGCCTTCTC GGCCCGCGCC CCCGCCGCCT GCACCCCCAT CTGCTCTTCC CCGCGGGGGC 360
CGCGCGGCGC GGGCTGGGGG CCCGGGCAGC CGCGCTCGGG CAGCGGGGGC GCGGGGCTGC 420
CGCCTGCGCT CGCAGCTGGT GCCGGTGCGC GCGCTCGGCC TGGGCCACCG CTCCGACGAG 480
CTGGTGCGTT TCCGCTTCTG CAGCGGCTCC TGCCGCCGCG CGCGCTCTCC ACACGACCTC 540
AGCCTGGCCA GCCTACTGGG CGCCGGGGCC CTGCGACCGC CCCCGGGCTC CCGGCCCGTC 600
AGCCAGCCCT GCTGCCGACC CACGCGCTAC GAAGCGGTCT CCTTCATGGA CGTCAACAGC 660
ACCTGGAGAA CCGTGGACCG CCTCTCCGCC ACCGCCTGCG GCTGCCTGGG CTGAGGGCTC 720
GCTCCAGGGC TTTGCAGACT GGACCCTTAC CGGTGGCTCT TCCTGCCTGG GACCCTCCCG 780
CAGAGTCCCA CTAGCCAGCG GCCTCAGCCA GGGACGAAGG CCTCAAAGCT GAGAGGCCCC 840
TACCGGTGGG TGATGGATAT CATCCCCGAA CAGGTGAAGG GACAACTGAC TAGCAGCCCC 900
AGAGCCCTCA CCCTGCGGAT CCCAGCCTAA AAGACACCAG AGACCTCAGC TATGGAGCCC 960
TTCGGACCCA CTTCTCACAG ACTCTGGCAC TGGCCAGGCC TCGAACCTGG GACCCCTCCT 1020
CTGATGAACA CTACAGTGGC TGAGGCATCA GCCCCCGCCC AGGCCCTGTA GGGACAGCAT 1080
TTGAAGGACA CATATTGCAG TTGCTTGGTT GAAAGTGCCT GTGCTGGAAC TGGCCTGTAC 1140
TCACTCATGG GAGCTGGCCC C



Seq ID NO: 279 Protein sequence:
Protein Accession #: NP_476501.1



1 11 21 31 41 51

| | | | | |

MPGLISARGQ PLLEVLPPQA HLGALFLPEA PLGLSAQPAL WPTLAALALL SSVAEASLGS
APRSPAPREG PPPVLASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSALPRGG
RAARAGGPGS RARAAGARGC RLRSQLVPVR ALGLGHRSDE LVRFRFCSGS CRRARSPHDL
SLASLLGAGA LRPPPGSRPV SQPCCRPTRY EAVSFMDVNS TWRTVDRLSA TACGCLG

Seq ID NO: 280 DNA sequence 

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29-715



1 11 21 31 41 51
| | | | | |
CTGATGGGCG CTCCTGGTGT TGATAGAGAT GGAACTTGGA CTTGGAGGCC TCTCCACGCT 60
GTCCCACTGC CCCTGGCCTA GGCGGCAGGC TCCACTTGGT CTCTCCGCGC AGCCTGCCCT 120
GTGGCCCACC CTGGCCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT CCCTGGGCTC 180
CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT CCCCCGCCGG 240
CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC GGCCGCCGCC 300
GCAGCCTTCT CGGCCCGCGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC CCCGCGGGGG 360
CCGCGCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG CGCGGGGCTG 420
CCGCCTGCGC TCGCAGCTGG TGCCGGTGCG CGCGCTCGGC CTGGGCCACC GCTCCGACGA 480
GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCCGC GCGCGCTCTC CACACGACCT 540
CAGCCTGGCC AGCCTACTGG GCGCCGGGGC CCTGCGACCG CCCCCGGGCT CCCGGCCCGT 600
CAGCCAGCCC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG ACGTCAACAG 660
CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG GCTGAGGGCT 720
CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG GGACCCTCCC 780
GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAG GCCTCAAAGC TGAGAGGCCC 840
CTACCGGTGG GTGATGGATA TCATCCCCGA ACAGGTGAAG GGACAACTGA CTAGCAGCCC 900
CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG CTATGGAGCC 960
CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTG GGACCCCTCC 1020
TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT AGGGACAGCA 1080
TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA CTGGCCTGTA 1140
CTCACTCATG GGAGCTGGCC CC



Seq ID NO: 281 Protein sequence:
Protein Accession #: NP_476431.1

1 11 21 31 41 51

| | | | | |

MELGLGGLST LSHCPWPRRQ APLGLSAQPA LWPTLAALAL LSSVAEASLG SAPRSPAPRE 60
GPPPVLASPA GHLPGGRTAR WCSGRARRPP PQPSRPAPPP PAPPSALPRG GRAARAGGPG 120
SRARAAGARG CRLRSQLVPV RALGLGHRSD ELVRFRFCSG SCRRARSPHD LSLASLLGAG 180
ALRPPPGSRP VSQPCCRPTR YEAVSFMDVN STWRTVDRLS ATACGCLG



Seq ID NO: 282 DNA sequence

Nucleic Acid Accession #: Eos sequence

1 11 21 31 41 51

CTACTGCACC TGCCCTCTCT TTCCTTrGGA AATCTCTTAC CTTTCATTAG GGTTTCrrTC 60 

ATAGCAATTT CCTTTGGTTT TTAAGACTTC TACATTGCTT TTTCTTTTAT TATCTGTGCT 120 

CCGTGAACCr TATGAATGCT GCTTAAAAAT AATGTCAAAA TATGTTTTAG CTGOCTACTC 180 

AfiGTAACGTT TTCTTTTGCT CTCATCTTGG TTTCCATATA CTATTTTTGG TTTTTTGTGA 240 

GATCTAATCA ATGATCTAQT CAGAAGCTAC TTCACrGGCT AACAGTGATC ATGTTCATGT 300 

GCTAAAAATG AACTTGAAAC ACGGAAGTAG TGGTTGGTCC AG TTTGAAAG CTCTTATTAG 360 

TATTCTTCAT CCTGGCTGTA ATAATAGCCA TTATTTGTTA TGCCTTTGTT AIGTA GCftG A 420 

CACTCTTAAG GATTTTATGT GTATTATTCA AATPGCTATT ACTOITCTTT TTATAGTTGA 480 
GAATCTCAGG ATACCTACAT TTATCACTTT TTCAATATAT ATGTATTTCT TATT 



Seq ID NO: 283 DNA sequence

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 564-1481 

1 11 21 31 41 51 

GA6ACTTTTA ATCATCTATC CCTTGTGCrT TAOGCAGACX: CTACAATACA CTAGAGGCTT 60 

CAAAGAGGTC AAAAATTCAC ATGTGTA6AC AAATTAGGTC CCTTAAGATG CCAGGCAAAC 120 

GAAGTGCTAC CAAAACACGC AATGACTGTC CTAAAAGTGC GTTCTQGGAT ACACCTGTAA 180 

ACTTGGATCA AGTTCCCTCC CCTCTCCTCA AAATATATCG ACTTGTGCTO AAAGAAATCA 240 

GGAOCGATGC TCACAATTCT GACCTCGTAA TTATATAGGG GGTGGTTTTG GTTTCTGCGT 300 

CTTTCCCTGA TTCAGTGGCA GGTAACATAT TTCATGTACA AAATGAACTG CAACACCACG 360 

GCAAACAAGG GACAGGCCCT CAAAGTTGTC GGTAGQGAGC CAGGACOCOG CCAGTGGCGT 420 

GGGGAGACAC OGTACTAAAC AAGCTTGCAA ACAGCAGGCA CCTTCCTGCC ACTGAGGAGG 480 

AAOaGCTGGC TAAGGGAGGC CX5GQG0GGAG GAAGOCAAGC TCTGCAlGGCC CTGACAAAGT 540 

CCTCCCGOCC TCCACGCGTC GCCATGGCAA OGGGGGGTCT GTGCTGGCCG GGATTGGCCG 600 

GCCTGGCGCG CGCAGGGCCC GCTGGGAAAG CGCGTCCCCG CC GOGGC TCC GCCAGTTTGA 660 

ACTTGGCGGG CCAGATGTGG GCGGCGGGGC GCTGGGGGCC TACTTTTCCC TCTTCCTAOG 720 

COGC?mCTC TGCTGACTGC AGACCCAGGT CTCGGCCCTC CTCGGACTCC TGCTCAGTCC 780 

CTATGAOGGG C6CA0GTGGG CAGGGGCTGG AGGTGGTGCG CTCGCCGTOG CCGCCGCTGC 840 

OGCTGAGCTG CAGCAATTCC ACCAGGTCGC TGTTGTCTCC CCTTGGCCAC CAGAGCTTCC 900 

AGTTTGACGA GGACGACGGT GACX3GGGAGG ATGAGGAAGA OGTGGATGAT GAGGAAGACG 960 

TGGATCAAGA TGCCCATGAT TCAGAGGCCA AAGTG6CGAG CXTEGAGAGGA ATGGAGTTAC 1020 

AGGGGTGCGC CAGCACTCAG GTTGAATCAG AAAATAACCA AGAAGAACRG AAACAGGTGC 1080 

GCTTACCAGA AAGCCGCCTG ACACCATGGG AGGTGTGGTT TATTG6CAAA GAAAAAGAAG 1140 

AACGTGACCG GCTGCAACTG AAAGCTCTAG AGGAATTAAA TCAACAACTA GAAAAAAGAA 1200 

AAGAAATG6A AGAAOSTGAA AAAAGAAAGA TAATTGCTGA AGAAAAGCAC AAGGAATGGG 1260 

TTCAGAAAAA GAATCAGCAA AAAAGAAAAG AAAGAGAACA AAAAATTAAT AAAGAAATGG 1320 

AGOAAAAAGC AGCaAAGGAA CTGGAGAAAG AATACTTGCA AGAAAAAGCA AAAGAAAAAT 1380 

ATCAAGAATG GTTAAAGAAA AAAAATGCTG AAGAATGTGA GAGGAAGAAG AAAGAAAAGA 1440 

AAAACAACAG CAAGCTGAAA TACAGGAGAA AAAGGAAATA GCAGAAAAAA AGTTTCAAGA 1500 

ATGGTTCGAA AATGCGAAAC ATAAACCTCX3 TCCAGCTGCA AAGAGCTATG GTTATGCCAA 1560

TCGAAAACTT ACAGGTTTTT ACAGTGGAAA TTCCTATCCA GAACCAGCCT TTTATAATCC 1620 

AATTCCGTGG AAAOCAATTC ATATGCCACC TCCCAAAGAA GCTAAGGATC TATCAGGAAG 1680 

GAAGAGTAAA AGACCTGTGA TAAGTCAGCC ACACAAGTCA TCATCTCTGG TAATTCATAA 1740 

AGCCAGGAGC AATCTTTGCC TTGGAACTCT GTGCflGAATA CAAAGATAGC GTAT GTGGAA 1800 

AATAACATGC TTTTATCTGG AGCTATTTAA TTTAAAAATC AGAA ATTGTT TTTTACTGCT 1860 

CAGTCAATAA CTCAACACTT AATGTGATTA TTGACAAATA GCAATTTTTG ^ATTTCTATA 1920 

TCGAGTCCTT AGAGTTGAGG AAGATATTTT CTGGATTTTG GTrTTTATAA ACTTTTTAAG 1980 

GTTGATCTTG GCATCTTGTT TTGCAGAATA AGTGGCTGAA TATGTAAGAA TTGTGTTTGT 2040 
ATTTAGCTTG TATTAAAAGT ACACTGTAAT ACCAATAAAA CTAACAATTT TTCTTG 



Seq ID NO: 284 Protein sequence:
Protein Accession #: Eos sequence



1 11 21 31 41 51

ilATRGtCWPG LAGLARAGPA GKARPRRGSA SLNLAGQMWA AGRWGPTPPS SYAGPSADCR 60 

PRSRPSSDSC SVPMTGARGQ GLEWRSPSP PLPLSCSNST RSLLSPLGHQ SFQFDEDDCT 120 

GEDEEDVDDE EDVDEDAHDS EAKVASLRGM ELQGCASTQV ESENUQEEQK QVRIiPBSRLT 180 

PWEVWPIGKE KEBRDRIiQIiK ALEELNQQLB KRKEMBEREK RKIIAEEKHK EWVQKKNEQK 240 

RKEREQRINK EHEEKAAKEL BKEYLQEKAK EKYQEWLKKK NAEECERKKK EKKNNSKLKY 300 




RRKRK 

Seq ID NO: 285 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1-1746 






1 11 21 31 41 51
| | | | | |
ATGCCACTGA AGCATTATCT CCTTTTGCTG GTG0GCTGCC AAGCCTGGGG TGCAGGGTTG 60
GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCCAGGG CCTCCCAGGT GGAGTGCACC 120
GGGGCAOCCA TTGTGGCGGT GCCCACCCCT CTGCCCTGGA ACGCCATGAG CCTGCACATC 180
CTCAACAGGC ACATCACTGA ACTCAATGftG TCCCCGTTCC TCAATATCTC AGCOCTCATC 240
GCCCTGAGGA TTGAiGAAGAA TGAGCTGTOG OGCATCACGC CTGGGGCCTT CCGAAACCTG 300
GGCTCGCTGC GCTATCTCAG CCTCGCCAAC AACAAGCTGC AGGTTCTGCC CATCGGCCTC 360
TTCCAQGGCC TGGACAGCCT TGAGTCTCTC CTTCTGTCCA GTAACCAGCT GTTGCAGATC 420
CAGCCGGCCC ACTTCTCCCA GTGCAGCAAC CTGAAGGAGC TGCftGTTGCA 0GGGAACCAC 480
CTGGAATACA TCCCTGAOGG AGCCTTCGAC CACCTGGTAG GACTCACGAA GCTCAATCTG 540
GGCAAGAATA GCCTCACOCA CATCTCACCC AGGGTCTTCC AGCACCTGGG CAATCTCCAG 600
GTCCTCOGGC TGTATGAGAA CAGGCTCACG GATATCCCCA TGGGCACTTT TGATGGGCTT 660
GTTAACCTGC AGGAACTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC CCCTGGTCTC 720
TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACAACCACAT CTCCCAGCTG 780
CCACCCAGCA TCTTCATGCA GCTGCCOCAG CTCAACCGTC TTACTCTCTT TGGGAATTCC 840
CTGAAGGAGC TCTCTCTGGG GATCTTCGGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900
TATGACAACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCCG CCAGTTGCAG 960
GTOCTGATTC TTAGCCGCAA TCAGATCAGC TTCATCTCCC CGGGTGCCTT CAACGGGCTA 1020
AOGOAGCTTC GGGAGCTGTC CCTGCACAOC AACGCACTGC AGGACCTGGA CGGGAATGTC 1080
TTCCGCATGT TGGCCAACCT GCAGAACATC TCCCTGCflGA ACAATCGCCT CAGACAGCTC 1140
CCAGGGAATA TCTTCGCCAA CGTCAATGGC CTCATGGCCA TCCAGCTGCA GAACAACCAG 1200
CTGGAGAACT TGCCCCTCGG CATCTTCGAT CACCTGGGGA AACTGTGTGA GCTGOQGCTG 1260
TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCCGC TCCGCAACTG GCTCCTGCTC 1320
AACXAGCCTA GGTTAGGGAC GGACACTGTA CCTGTGTGTT TCAGCCCAGC CAATGTCCGA 1380
GGOCAGTCCC TCATTATCAT CAATGTCAAC GTTGCTGTTC CAAGCGTCCA TGTCCCTGAG 1440
GTGCCTAGTT AtXCAGAAAC ACCATGGTAC CCAGACACAC OCAGTTACOC TGACACCACA 1500
TCCGTCTCTT CTACCACTGA GCTAACCAGC CCTGTGGAAG ACTACACTGA TCTGACTACC 1560
ATTCAGGTCA CTGATGACCG CAGCGTTTGG GGCATGACCC AGGCCCAGAG OGGGCTGGCC 1620
ATTGCCGCCA TTGTAATTGG CATTGTCGCC CTGGCCTGCT CCCTGGCTGC CTGCGTCGGC 1680
TGTTGCTGCT GCAAGAAGAG GAGCCAAGCT GTCCTGATGC AGATGAAGGC ACCCAATGAG 1740
TGTTAAAGAG GCAGGCTGGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGGAATT 1800
TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC OGTGATTGCT CTTTCTGGCC 1860
CTAGATAAAG GTGTGCCTAC CTCTTCCTGA CTTGCCTGAT TCTCCCGTAG AGAAGCAGGT 1920
CGTGCCGGAC CTTCCTACAA TCAGGAAGAT AGATCCAACT GGCCATGGCA AAAGCCCTGG 1980
GGATTTCCGA TTCATACX:CC TGGGCTTCCT TCGAGAGGGC TCTTCCTCCA AATCCTCCCC 2040
ACCtGTCCTC CAAGAACAGC CTTCCCTGCG CCCAGGCCCC CTOOGGGCCT CTGTAGACTC 2100
AOTTAGTCCA CAGCCTGCTC ACTTCGTGGG AATAGTTCTC OGCTGAGATA GCCCCTCTCG 2160
CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATGGCTTG 2220
ACCCAGCATG TCCXXTTCAAA TGAAAGTTCT CCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280
TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCGTCAAT 2340
CAGCCTGGTT TTGGGGATGC TATGAAAGAG AGAAGGAAAA TCATGGOGCT CAGTTCCTGG 2400
AGACAGAAGA GCCGTCATCA GTGTCTCACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460
CCCCAGCACA GCAAGCTCAG CCTTTTAGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520
TGAAAAGTTT AGCCCTTTAA GGAATGAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580
AAAATCAGCT TATTAATACG GGATAGAGAA AGAAATCTGG TGCCTGGGGG TCCCTGTGTT 2640
CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGTGTAC STGCAGAAAA 2700
GTGGGAACAT GATAGTGTAT GGCTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760
CAGCATCTAG ACCCAGACCX: AGAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820
GGAGATGGGG GCTTCTGAAG ATGGACTTAC CTGGGACCTG CCCCCCATGA GCCAGGACX3G 2880
TCCCXXXaCA GTCAGCCTGT GCAAAGGCCC CGTGGCCAGG GGTGGAGGAG AATATGTGGG 2940
TGTGGACAGG ATGGGAGACT GTGGCCTGAA CAGGAGATTT TATTATATCT GGAGACCCTG 3000
AGAGACCCTG AGACCTGGGG CACCATGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAG 3060
GTCOGTGCAG CCACACCCTC TTCCCTGCCA GCAAGTTGTC TGCGGCTCAT CGGAGGCCCC 3120
TCCGCCTGGA GCCTTCTATG GACGTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180
ACTTAGGGGA AGTGAAATCG CTCAGAGATG AGATCCTTTA ATTGAAAACG AAGTGTAACG 3240
GAATCTAGTG TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300
TGAACTTCAG AATCTCACTT ACAGCAGGCG ACAOGGGOGT ACACCGATGG GTCACACTGG 3360
GTCTGGGGGC TCCClfeCAGC TCCTCCTGCXS TGTGOTCTGG TTAGGAGTTG AGTTGTTTGC 3420
TCCAGGGTTA TTCTCCTCCT GGAGTCACAG TCACACGAAT ACCTGCCTTC TCTGGCTTTC 3480
CTGCTATACA CATATTCACA TGGCGCTCAA GAAGTTAGGC TCATGGCAAC GTGTGTCTTT 3540
CTCTGGACAA CTGGCCCAGT TTACAGTGAA ATGGAGAATT TCAGGTCTCC ACGTCTGCCC 3600
AGGAAAGAAC TTCAGCTGAC TCCAOKSGGA TCTGGAAATC CACGACCAAT CCCGATGGGC 3660
TCTTATPAGC TCCCCGCTCC ACAAGACACC TGTGCTTTGG AAATCCACCA CCAATCCCGA 3720
TCGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCACCAAT 3780
CCCGATGGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAGGGCCA 3840
CAGGAGCAOQ TGCTGACCAG TTTTCCCTTC CAGnCCTGC ACAAAAAGTG TCGAGAGGGC 3900
TGTTTGCAAA CACTAGTGCA CTTTGTAGCT TTTCACCCTC TGTCCCAGGG AATCTAGGAG 3960
AGATGAGGCC OGTCAGAGTC AAGAGATGTC ATCCCCCCAG GGTCTCCAAG GCATTTCCAC 4020
ACTATTGGTG GCACCTGGAG GACATGCACC AAGGCTTGCC AGAGCCAACA GGAAGTGAGC 4080
CCAGAGCATG GCACATGAGC ATCACCCGCT GATGGTGGCC TGCTGTGCCT GGTGCCAACA 4140
GGGGCATCCC GGCCCGTACC CCTCCAGACA GGAAGCATGG GTTTGCCCAC AGACCTGTCG 4200
GGTGCTCCTG TGAGTGGCCT CCAGATGTCT TTGTGCATAG GCACAAGTGG GCCAGGGCTG 4260
GAGGGAGGTG GGAAACCTCA TCATCCGGTG GGCCCTGCCA ATCTTAACCC AGAACCCTTA 4320
GGTATTCCTG GCAGTAGCCA TGACATTGGA GCACCTTCCT CTCCAGCCAG flCGCTGACCT 4380
GAGGGCCACT GTCCTCAGAT GACACCACCC AGGAGCACCC TAGGTGAGGG GTGAGGGCCC 4440
CCTTATGTGA ACCTCTTGCC TCTTCCTTTC TCCCATCAGA GTGGTTGGAT GGAGCCATTG 4500
GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAGGAGC 4560
TACTAGAAAA GCTGAGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAATTCTCA 4620
GGGCTGGAAT GAGCCGGCTG GTCCCCCAGA AAGCTGGAGT GGGQTACAGA GTTCAGTTTT 4680
CCTCTCTGTT TACAGCTCCT TGACAGTCCC ACGCCCATCT GGAGTGGGAG CTGGGAGTTA 4740
GTCTIX3GAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTG 4800

TGCACAGATA CTCTTCAAGC ACTGGACGTG GATTCECTCT CTAGCCCTCA GCACCCC TGC 4860 

GGTAGCSACTG CCGCCTCTAC CCAOTGrCA TGGGGTACAG AGGCACTTGC TCTTCTGCAT 4920 

GGTGTTCAAT AGGCTGGGAG TTTTATTTAT CTCTTCAAAC TTrGTACAAG AGCTCATGGC 4980 

TTGTCTTCGG CTTTCGTCAT TAAACCAAAG GAAATGGAAG CCATTCCCCT GTTGCTCTCC 5040 

TTAGTCTTGG TCATCAGAAC CTCACITGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 5100 

GGAAAAAATA AACTCTTCCA TCCCTTAAAG AATAGAATAG TTT6TCCCTC TCATGGGAAT 5160 

TGGGCTGTAT GTATATTGTT CriCCfCer i AGAATTTAGA GATACAAGAG TTCTACTTAG 5220 

AACrmCAT GGACACAATT TCCACAACCT TTCA6ATGCT GATGTAGAGC TATTGGGAAA 5280 

GAACTTCCAA ACTCAGGAAG TTTGCAGAGA G CAGAC AGCT AGAGATAACT CGGGACCCAG 5340 

AGTTGGTCGA CAGATGTTAG ATGTATCCTA GCTTTTAGCC ATAAACCACT CAAAGATTCA 5400 

GOCCCCAGAT CCCACAGTCA GAACTGAATC TGOGTTGTTG GGAAGCCAGC AGTGGCCTTG 5460

GGAAGGAAGC CATGGCTGTG GTTCAGAGAG GGTCGGCTGG CAAGCCACTT CCGGGGAAAA 5520 

CTCCTTCCGC CCCAfiGTTTC TTCTTCTCTT AAGGAGAGAT TGTTCTCACC AACCCGCTGC 5580 

CTTCATGCTG CCTTCAAAGC TAGATCATGT TTGCCTTGCT TA^^ 56" 

GCCCCAGTCC TTGGCGATCC ATTTACAGAT TTCTAGGCCC TCAGGGTTTT GTAGACIGTG 5700 

AGCCCTGGTG GGCAGGGTTC GGGGGTCTGT CTTCTGCTGG ATGCTGCTTG TAATCCATTT 5760 
GGTGTACAGA ATCAACAATA AATAATATAC ATGTAT 

Seq ID NO: 286 Protein sequence:
Protein Accession #: NP_570843.1

1 11 21 31 41 51 

MPLKHYLLLL VGCQAHGAGL AVHGCPSECT CSRASQVECT GARIVAVPTP LPWNAMSIXJI 60 

LNTHITELSIE SPPLNISALI ALRIEKMELS RITPGAFRNL GSLRYLSIAN NKLQVLPIGI, 120 

FDGIiDSLESL LLSSNQLLQI QPAHFSQCSN LKELQLHGNH LEYIPDGAFD HLVGLTKLNL 180 

GKNSLTHISP RVFQHLGMLQ VLRLYENRLT DIPKGTFDGL VNI^BLAIXJQ NQIGLLSPGIi 240 

FHNNHNLORL YLSNNHISQL PPSIFt^LPQ LNRLTLFGNS LKELSLGIFG PMPHLRELWI. 300 

YDNHISSLPD NVFSNLRQLQ VLILSRKQIS FISPGAPNGL TEURSLSJXT NALQDLDGNV 360 

PRMLANLQNI SLQNNRLRQL PGNIFANVNG LMAIQU3NNQ LENLPLGIFD HI/5KLCELRI. 420 

YDMPHRCDSD ILPLRWWLLL KQPRLGTDTV PVCFSPANVR GQSLIIINVN VAVPSVHVPE 480 

VPSYPETPWY PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVTDORSVW GMTQAQSGLA 540 
lAAIVIGIVA LACSLAACVG CCOC3CKRS0A VLMQMKAPNE C 

Seq ID NO: 287 DNA sequence
Nucleic Acid Accession #: NM_002362
Coding sequence: 1..954

1 11 21 31 41 51

ATGTCTTCTG AGCAGAAGAG TCAGCACTGC AAGCCTGAGG AAGGGGTTGA GGCCCAAGAA 60

GAGGCCCTGG GCCTGGTGGG TGCACAGGCT CCTACTACTG AGGAGCAGGA GGCTGCTGTC 120

TCCTCCTCCT CTCCTCTGGT CCCTGGCACC CTGGAGGAAG TGCCTGCTGC TGAGTCAGCA 180

GGTCCTCCCC AGAGTCCTCA GGGAGCCTCT GCCTTACCCA CTACCATCAG CTTCACTTGC 240 

TGGAGGCAAC CCAATGAGGG TTCCAGCAGC CAAGAAGAGG AGGGGCCAAG CACCTCGCCT 300 

GACGCAGAGT CCTTGTTCCG AGAAGCACTC AGTAACAAGG TGGATGAGTT GGCTCATTTT 360

CTGCTCCGCA AGTATOGAGC CAAGGAGCTG GTCACAAAGG CAGAAATGCT OGAGAGAGTC 420 

ATCAAAAATT ACAAGOCCTG CTTTCCTGTG ATCTTCGGCA AAGCCTCCGA GTCCCTGAAG 480 

ATGATCTTTG GCATTGACGT GAAGGAAGTG GACCCCGCCA GCAACACCTA CACCCTTGTC 540

ACCTGCCTGG GCCTTTCCTA TGATGGCCTG CTGGGTAATA ATCAGATCTT TCCCAAGACA 600

GGCCTTCTCA TAATCGTCCT GGGCACAATT GCAATGGAGG GOGACAGCGC CTCTGAGGAG 660

GAAATCTGGG AGGAGCTGGG TGTGATGGGG GTGTATGATG GGAGGGAGCA CACTGTCTAT 720
GGGGAGCCCA GGAAACTGCT CACCCAAGAT TGGGTGCAGG AAAACTACCT GGAGTACCGG 780 
CAGGTACCCG GCAGTAATCC TGCGCGCTAT GAGTTCCTGT GGGGTCCAAG GGCTCTGGCT 840
GAAACCAGCT ATGTCAAAGT CCTGGAGCAT GTGGTCAGGG TCAATC3C»AG AGTTCGCATT 900 
GCCTACCCAT CCCTCCGTGA AGCAGCTTTG TTAGAGGAGG AAGAGGGAGT CTGA



Seq ID NO: 288 Protein sequence: 
Protein Accession #: NP_002353.1

1 11 21 31 41 51 

MSSEQKSQHC KPEEGVEAQE EALGLVGAQA PTTEEQEAAV SSSSPLVPGT LEEVPAAESA 60

GPPQSPQGAS ALPTTISFTC WRQPNEGSSS QEEEGPSTSP DAESLFREAL SNKVDELAHF 120

LLRKYRAKEL VTKAEMLERV IKNYKRCFPV IFGKASESLK MIFGIDVKEV DPASNTYTLV 180

TCLGLSYDGL LGNNQIFPKT GLLIIVLGTI AMEGDSASEE EIWEELGVMG VYDGREHTVY 240

GEPRKLLTQD WVQENYLEYR QVPGSNPARY EFLWGPRALA ETSYVKVLEH VVRVNARVRI 300
AYPSLREAAL LEEEEGV 

Seq ID NO: 289 DNA sequence 
Nucleic Acid Accession #: NM_002362
Coding sequence: 46..1344

1 11 21 31 41 51

CGGCGGCCGC GCCCTGGTTG GGTCCCCACT GCTCTCGGGG GCGCCATGGA CGAGGCCGTG 60 

GGCGACCTGA AGCAGGCGCT TCCCTGTGTG GCOGAGTCGC CAACGG TCCA CGTGGAGGTG 120 

CATCAGCGCG GCAGCAGCAC TGCAAAGAAA GAAGACATAA AOCTGAGTGT TAGA AAGCTA 180 

CTCAACAGAC ATAATATTGT GTTTGGTGAT TACACATGGA CTGAQTTTOA TQAACCTTTT 240 

TTGACCAGAA ATGTGCAGTC TGTGTCTATT ATTGACACAG AATT AAAGGT TAAAGACTCA 300 

CAGCCCATCG ATTTGAGTGC ATGCACTGTT GCACTTCACA TTTTCCAGCT GAATGAAGAT 360 

GGCCCCAGCA GTGAAAATCT GGAGGAAGAG ACAGAAAACA TAATTGCAGC AAATCACTGG 420 

GTTCTACCTG CRGCTGAATT CCATGGGCTT TGGGACAGCT TGGTATACGA TGTGGAAGTC 480 

AAATCCCATC TCCTOGATTA TGTGATGACA ACTTTACTGT TTTCftGACAA GAACGXCAAC 540 
ACCAAOCTCA TCACCTGGAA UOJ U GTGGTG CTGCTCCAOG GTCCTCCTGG CACTGGAAAA 600

ACATCCCTOT GTAAAGCGTT AGCCCAGAAA TTGACAATTA GACTTTCAAG CAGCTAOOGA 660 

TATGGCCAAT TAATTGAAAT AAACAGCCAC AfiCCTCTTTT CTAAGTGGTT TTOGQAAAGT 720 

GGCAACCIGG TAACCAAGAT GTTTC3VGAAG ATTCAGGATT TGATTGATGA TAAAGAOGCC 780 

C IX JU - I WUS TGCTGATTGA TCAGGTGGAG AGTCTCACAG COGCCCGAAA TGCCTGCAGG 940 

GOGGGCACCX; AGCCATCAGA TGCCATCOGC GTGGTCAATC CrGTCrTGAC CCAAATTGAT 900 

CAGATTAAAA GGCATTCCAA TGTTGTGATI CTGAOCACTT CTAACATCAC OGAGAAGATC 960 

GAOGTGGCCT TOGTGGACAG GGCTGACATC AAGCflGTACA TTQQGOCACC CTCTGCftGCA 1020 

GCCATCTTCA AAATCTACCT CTCTTGTTTG GAAGAACTGA TGAAGTCTCA GATCATATAC 1080 

CCTCGCCAGC AGCTGCTCAC CCTCCGAGAG CEAGAGATGA TTGGCTTCAT TGAAAACAAC 1140 

GTCTCAAAAT TCAGCCTTCT TTTGAATGAC ATTTCAAGGA AGAGOGAGGG CCTCAGCGGC 1200 

OGGGTCCTCA GAAAACTCCC CTTTCTGGCT CATGaSCTOT ATGTCCAGGC CCXXACOGTC 1260 

ACCATAGAGG GGTTCCtCCA GGCCCIGICT CTGGCAGTGG ACAAGCAGTT TGAAGAGAGA 1320 

AAGAAGCTTG CAGCPTACAT CTGATCCTGG GCTTCCCCAT CTGGTGCTTT TOCOlT^AG 1380 

AACACACAAC CAGTAAOTGA GGlTlGCCCXa^ CAC3«XXGTC TCCCAGGGAA TOSCTTCTGC 1440 

AAACCAAACG TTACTTAGAC TGCAAGCTAG AAAGOCACCA AGGCCAGGCT TTGTTAAAAC 1500 

AAGTGTATTC TATrTATCTT GTTTTAAAAT GCATACTGAG AGACAAACAT CTTGTCATTT 1560 

TCACTGTTTG TAAAAGATAA TTCAGATTGT TTGTCTCCTT GTGAAGAACC ATCGAAACCT 1620 

GTTTCTTCCC AGCCCACCCC CAGTGGATGG GATCCATAAT GCCAGCAAGT TTTOTTAAC 1680 

AGCAAAAAAG GAAGATTAAT GCAG6TGTTA TAGAAGCCAG AACAGAAACT GTGTCACCX:t 1740 

AAAGAAGCAT ATAATCATAG CATTAAAAAT GCACACRTTA CTOCAGGTGG AAOGTGGCAA 1800 

TTCCTTTCTG ATATCAGCTC GTTTGATTTA GTGCftAAAAT GTTTTCAAGA CTATTTAATG 1860 

GATCTAAAAA AGCCTAnTC TACATTATAC CAACTGAGAA AAAAATGGTC GGTAAAGTGT 1920 

TCTTTCATAA TAAATAATCA AGACATGGTC CGATTTGCAG GAAAAGTGCA GACTCTGPlGT 1980 

GTTCCAGGGA AACACATGCT GGACATCCCT TGTAACCCGG TATGGGCGCC CCTGCATTGC 2040 

TCGGATGTTT CXXXXXaCGG rmUH i 'G l - GCAATAACGT TATCACATTT CTAATGAGGA 2100 

TTCACATTAA TATAATATAA AATAAATAGG TCACTTACTG GTCTCTTTCT GCOGAATGTT 2160 
ATGTTTTGCT TTTATCTCAC AGTAAAATAA ATATAATTAA AAA 



Seq ID NO: 290 Protein sequence: 
Protein Accession #: NP_004228

1 11 21 31 41 51

ilDBAVGDLKQ ALPCVAESPT VHVEVKQRGS STAKKEDINL SVRKLLKRHN IVPGDYTWTE 60 

PDEPPLTRNV QSVSIIDTEL KVKDSQPIDL SACTVALHIF QLNEDGPSSE NLEEETENII 120 

AANHWVLPAA EFHGLWDSLV YDVEVKSHLL DYVMTTLLFS DKNVNSNLIT WNRWLLHGP 180 

PGTGKTSLCK ALAQKLTIRL SSRYRYGQLI EINSHSLFSK ttPSESGKLVT KMFQKI^I 240 

DDKDALVPVL IDEVESLTAA RNACRAGTEP SDAIRWMAV LTQIDQIKRH SNWILTTSN 300 

ITBKIDVAFV DRADIKQYIG PPSAAAIFKI YLSOiEELMK OQllYPRQQL LTLRELEMIG 360 

PIENNVSKLS LLUJDISRKS EGLSGRVLRK LPPLftHAIiYV QAPTVTIEGF LQALSUVVDK 420 
QFEERKXZiAA YI 



Seq ID NO: 291 DNA sequence

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77-1372 

1 11 21 31 41 51 

GTCOCCGCAG ixSCCGTCGCG CCCTCCTGCC GCAGGCCACC GAGGCCGCCG CCGTCTAGCX5 60 

CCCCGACCTC GCCACCATGA GAGCCCTGCT GGCGCGCCTG CTTCTCTGCG TCCTGGTCGT 120 

GAGCGACTCC AAAGGCAGCA ATGAACTTCA TCAAGTTCCA TCGAACTGTG ACTGTCTAAA 180 

TGGAGGAACA TGTGTGTCCA ACAACTACTT CTCCAACATT CACTGGTGCA ACTGCCCAAA 240 

GAAATTCGGA GGGCAGCACT GTGAAATACA TAAGTCAAAA ACCTGCTATG AGGGGAATGG 300 

TCACTTTTAC CGAGGAAAGG CCAGCACTGA CACCATGGGC CGGCCCTGCC TGCCCTGGAA 360 

CTCTGCCACT GTCCTTCAGC AAACGTACCA TGCCCACAGA TCTGATGCTC TTCAGCTGGG 420 

CCTOGOGAAA CATAATTACT GCAGGAACCC AGACAACCGG AGGCGACXXTT GGTGCTATGT 480 

GCAGGTGGGC CTAAAGCCGC TTGTCCAAGA GTGCATGGTG CATGACTGCG CAGATGGAAA 540 

AAAGCCCTCC TCTCCTCCAG AAGAATTAAA ATTTCAGTGT GGCX3UWUW5A CTCTGAG GCC 600 

CCGCTTTAAG ATTATTGGGG GAGAATTCAC CACCATCGAG AACCAGCCCT GGTTrGCGGC 660 

CATCTACAGG AGGCACCGGG GGGGCTCTGT CACCTAOGTG TGTGGAGGCA GCCPCATCAG 720 

CCCTTCCTGG GTGATCAGCG CCACACACTG CTTCATTGAT TACCCAAAGA AGGAGGACTA 780 

CATOGTCTAC CTGGGTCGCT CAAGGCTTAA CTCCAACACG CAAGGGGAGA TGAAGTTTGA 840 

GGTGGAAAAC CTCATCCTAC ACAAGGACTA CAGCGCTGAC ACGCTTGCTC ACCACAACGA 900 

CATTGCCTTC CTGAACATCC GTTCCAA6GA GGGCRGGTGT GCGCAGCCAT CCCGGACTAT 960 

ACAGACCATC TGCCTGCCCT CGATGTATAA OGATCCCCftG TTTGGCACAA GCTGTGAGAT 1020 

CACTGGCTTT GGAAAAGAGA ATTCTACCGA CTATCTCTAT CXX3GAGCAGC TGAAAATGAC 1080 

TCTTGTGAAG CTGATTTCCC ACCGGGAGTG TCAGCAGCCC CACTACTAOG GCTCT6AAGT 1140 

CACCACCAAA ATGCTATGTG CTGCTGACCC CCAATGGAAA ACAGATTOCT GCCAGGGAGA 1200 

CTCAGGGGGA CCCCTOGTCT GTTCCCTCCA AGGCOGCATG ACTTTGACTG GAATTGTGAG 1260 

CTGGGGCCGT GGATGTGCCC TCAAGGACAA GCCStfSGCGTC TACACX5AGAG TCTCACACTT 1320 

CTTACCCTGG ATCOGCAGTC ACACCAAG6A AGAGAATGGC CTGGCCCTCT GAGGGTCCCC 1380 

AGGGAGGAAA CX^GGCACCAC CCGCTTTCTT GCTGQTTGTC ATTTTTGCAG TAGAGTCATC 1440 

TCCATCAGCT GTAAGAAGAG ACPGGGAAGA TAQGCTCTGC ACAGATGGAT TTGCCTGTGG 1500 

CACCACCAGG GTCAACGACA ATAGCTTTAC CCTCAOGGAT AGGCCTGGGT GCTGGCTGCC 1560 

CAGACCCTCT GGCCAGGATG GAGGGGTGGT CCTGACTCAA CATGTTACTG ACCAGCAACT 1620 

TGTCTTTTTC TGGACTGAAG CCTGCAGGAG TTAAAAAGGG CAGGGCATCT CCTGTGCATG 1680 

GGCTCGAACG GAGAGCCAGC TCCCOOGAOC GGTGGGCATT TGTGAGGCCC ATGGTTGAGA 1740 

AATGAATAAT TTCCCAATTA GGAAGTGTAA GCAGCTGAGG TCTCTTGAGG GAGCTTAGCC 1800 

AATGTGGGAG CAGCGGTTTG GGGAGCAGAG ACACTAAOGA CTTCAGGGCA GGGCTCTGA T 1860 

ATTCCATGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTCCACACT TGTTCTGTGG 1920 

GCTGTGAGTG TAAGTCTGAG TAAGAGCTCG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980 

AAACTCTGTG GACTGTGATG CCRCACAGAG TGGTCTTTCT GGAGAGGTTA TAGGTCACTC 2040 

CTGGGGCCTC TTGQGTCCOC CAOGTGACAG TGCCTGGGAA TGTACTTATT CTGCMSCA'CG 2100 

ACCIGTGACC AGCACXGTCT CAGTTTCACT TTCACATAGA TGTCCCTTTC TTGGCCAGTT 2160 

ATCCCTTCCT TTTAGCCTAG TTCATCCAAT CCTCACIGGG TGGG GTGAGG ACCACTCCTT 2220 

ACACTGAATA TTTATATTTC ACTATTTTTA TTTATATTTT TGTAATTTTA AATAAAAGTG 2280 
ATCAATAAAA TGTGATTTTT CTGA

Seq ID NO: 292 Protein sequence:
Protein Accession #: NP_002649.1

1 11 21 31 41 51

iLaLLAIOLI. CVLWSDSKG SNELHOVPSN CDCUTOGTCV StnWFSNIHW aiCPKKFGGO 60 

HCEIDKSKTC YEOfGHFYRG KASTDTMGRP CLPWNSATVL QQTyHAHRSD ALQI/3LGKHN 120 

YCRNPDURRR PWCYVQVGUC PLVQECMVHD CADGKKPSSP PBELKFQCGQ KTLRPRFKII 180 

GGBPTTIENQ PWFAAIYRRH RGGSVTYVOG GSLISPCWVI SATHCFIOYP KKEDYIVYLG 240 

RSRLNSNTQG EMKPEVENLI LHKDYSADTL AHHNDIALLK IRSKEtaCAQ PSRTIQTIO. 300 

PSMYNDPQFG TSCEITGFGK EWSTDYLYPE QLKKTWKbl SHRECQQPHV 1«3SEVTTO& 360 

CAADPQWKTD SCQGDSGGPL VCSIXJCSMTL TGIVSWGRGC AUCDKPGVYT RVSHPtPMIR 420 
SHTKEENGUV L 

Seq ID NO: 293 DNA sequence 
Nucleic Acid Accession #: NM_001498
Coding sequence: 93..2006

1 11 21 31 41 51 

GGCAOGAGOC TGAGTGTCCG TCTCGCGCCC GGAAGCGGGC GACCGCOGTC AGCXXXJCTGG 60 

aSaSa^ GGAGGACGAG GAGGGGGCGG CCATGGGGCT GCTOTCCCAG GGCTCGOCGC 120 

TG^CTGGGA GGAAACCyVAG CGCCATGCCG ACCACGTGCG GCGGCACGGG ATCCTCCAGT 180 

TCCTCCACAT CTACCACGCC GTCAAG6ACC GGCACAAGGA CGTTCTCAAC TGGGGOGATG 240 

AGGTCGAATA CATGTTGGTA TCTTTTGATC ATGAAAATAA AAAAGTCCGG TTGGTOCTGT 300 

CTGGGGAGAA AGTTCTTGAA ACTCTGCAAG AGAAGGGGGA AAGGACAAAC CCAAACOTC 360 

CTACCCTTTG GAGACCAGAG TATGGGAGTT ACATGATTGA AGGGACACXa^ GGACAGCCCT 420 

AC3GGAGGAAC AATGTCCGAG TTCAATACAG TTGAGGCCAA CATGCGAAAA CGCOGGAAGG 480 

AGGCT^C TATATTAGAA GAAAATCAGG CTCTTTGCAC AATAACTTCA TTTCCCAGAT 540 

TAGGCTCTCC TGGGTTCACA CTGCCCGAGG TCAAACCCAA CCCAGTGGAA GGAGGAGCTT 600 

CCAAGTCCCT CTTCTTTCCA GATGAAGCAA TAAACAAGCA CCCTGGCTTC AGTACCTTAA 660 

CAAGAAATAT CCGACATAGG AGAGGAGAAA AGGTTGTCAT CAATGTACCA ATATTTAACG 720 

ACAAGAATAC ACCATCTCCA TTTATAGAAA CATTTACTGA GGATGATGAA GCTTCAAGGG 780 

CTTCTAAGCC GGATCATATT TACATGGATG CCATGGGATT TGGAATGGGC AATTGCTGTC 840 

TCCAGGTGAC ATTCCAAGCC TGCAGTATAT CTGAGGCCAG ATACCTTTAT GATCAGTTGG 900 

CTACTATCTG TCCAATTGTT ATCGCTTTGA GTGCTGCATC TCCCTTTTAC CX5RGGCTATG 960 

TGtSgACAT TGATTGTCGC TGGGGAGTGA TTTCTGCATC TGTAGATGAT AGAACTOGGG 1020 

AGGAGCGAGG ACTGGAGCCA TTGAAGAACA ATAACTATAG GATCAGTAAA TC^TATG 1080 

ACTCAATAGA CAGCTATTTA TCTAAGTGTG GTGAGAAATA TAATGACATC GACTTGACGA 1140 

TAGATAAAGA GATCTACGAA CAGCTGTTGC AGGAAGGCAT TGATCATCTC CTGGCCCAGC 1200 

ATGTTGCTCA TCTCTTTATT AGAGACCCAC TGACACTGTT TGAAGAGAAA ATACACCTGG 1260 

ATGATGCTAA TGACTCTGAC CATTTTGAGA ATATTCAGTC CACAAATTGG CAGACAATGA 1320 

GATTTAAGCC CCCTCCTCCA AACTCAGACA TTGGATGGAG AGTAGAATTT OGACCCATGG 1380 

AGGTGCAATT AACAGACTTT GAGAACTCTG CCTATGTGGT GTTTGTGGTA CTGCTCACCA 1440 

GAGTGATCCT TTCCTACAAA TTGGATTTTC TCATTCCACT GTCAAAGGTT GATGAGAACA 1500 

TCAAGGTAGC ACAGAAAAGA GATGCTGTCT TGCAGGGAAT GTTTTATTTC AGGAAAGATA 1560 

TTTGCAAAGG TGGCAATGCA GTGGTGGATG GTTGTGGCAA GGCCCAGAAC AGCAOGGAGC 1620 

TOGCTGCAGA GGAGTACACC CTCATGAGCA TAGACACCAT CATC3WITGGG AAGGAAGGTG 1680 

TGTTTCCTGG ACTGATCCCA ATTCTGAACT CTTACCTTGA AAACATGGAA GTGGATGTGG 1740 

ACACCAGATG TAGTATTCTG AACTACCTAA AGCTAATTAA GAAGAGAGCA TCTGGAGAAC 1800 

TAATGACAGT TGCCAGATGG ATGAGGGAGT TTATCGCAAA CCATCCTGAC TACAAGCAAG 1860 

ACAGTGTCAT AACTGATGAA ATGAATTATA GCCTTATITT GAAGTGTAAC CAAATTGCAA 1920 

ATGAATTATG TGAATGCCCA GAGTTACTTG GATCAGCATT TAGGAAAGTA AAATATAGTG 1980 

GAA(3TAAAAC TCACTCATCC AACTAGACAT TCTACAGAAA GAAAAATGCA TTATTGACGA 2040 

ACTGGCTACA CTTACCATGCC TCTCAGCCCG TGTGTATAAT ATGAAGACCA AATGATAGAA 2100 

CTGTACTGTT TTCTGGGCCA GTGAGCCAGA AATTGATTAA GGCITTCrrT GGTAG GTAAA 2160 

TCTAGAGTTT ATACAGTGTA CATGTACATA GTAAAGTATT TTTGATTAAC AATGTATTTT 2220 

AATAACATAT CTAAAGTCAT CATGAACTGG CTrGTACATT TTTAAATTCT TACTCTGGAG 2280 

CAACCTACTG TCTAAGCAGT TTTGTAAATG TACTGGTAAT TGTACAATAC TTGCATTCCA 2340 

GAGTTAAAAT GTTTACTGTA AATTTTTGTT CTTTTAAAGA CTACCTGGGA CCTGATTTAT 2400 

TGAAATTTTT CTCTTTAAAA ACATTTTCTC TCGTTAATTT TCCTTTGTCA TTTCCTTTGT 2460 

TGTCTACATT AAATCACTTG AATCCATTGA AAGTCCTTCA AGGGTAATCT TO OGr TTCTA 2520 

GCACCTTATC TATGATGTTT CTTTTGCAAT TGGAATAATC ACTIGGTCAC CTTGCXXrCAA 2580 
GCTTTCCCCT CTGAATAAAT ACXX31TTGAA CTCTGAAAAA AAAAAAAAAA AAAA 



Seq ID NO: 294 Protein sequence:
Protein Accession #: NP_001489

1 11 21 31 41 51 

itGLLSQGSPL LeCTKRHAD HVRRHGIIiQP LHIYHAVKDR HKDVUWGDE VEYMLVSFDH 60 

ENKKWRLVLS GEKVLETLQE KGERTNPNKP TLWRPSYGSY MIBGTPGQPY GGTMSEFNTV 120 

EANMRKRRKE ATSILEENQA UnTTSPPRL GCPGPTI.PEV KPNPVEX3GAS KSLFPPDEAI 180 

NiCHPRPSTLT RNIRHRRGEK WXNVPIFKD KNTPSPPIET FTEDDEASRA SKPDEXYMDA 240 

MGFa<GNCCri QVTPQACSIS EARYIiYDQLA TICPIVMALS AASPFYRGYV SDIDCRWGVI 300 

SASVDDRTRE ERGLEPLKNN NYRISKSHYD SIDSYLSKCG EKYNDXOLTI OKBZYEQX.LQ 360 

E6IDHLLAQH VAHLFIRDPL TLFEEKIHLD DANESDHFEN IQSTNWQTMR PKPPPPNSOI 420 

GWRVEPRPME VQLTDFEMSA YWFWLLTR VILSYKLDFL IPLSKVDENM KVAQKRDAVL 480 

QGMFYFRKDI CKGGNAWDG CGKAQNSTEL AAEEYTLMSl DTIINGKEGV PPGLIPILNS 540 

YLENMEVDVD TRCSILUYLK LIKKRASGBL MTVARWHREF lANHPDYKQD SVITDEMNYS 600 
LILKCNQIAN ELCECPBIiLG SAFRKVKYSG SKTDSSS 




Seq ID NO: 295 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-816 






1 11 21 31 41 51
| | | | | |

AGT6TT0GGC TGGGGCAGGC AOGCTGTGGC TGGCTACTTC CCTTCCrCCC ATCCCOCTTG 60
GGCCAAAOQG GATOGGTGCT TCTGGTGAGA GGCCrCCCCA TGCACATCAC T0C3CAGGTGC 120
CCTAGGGC5GC ACATTTCCCA C3UVCTCCCAG AGG6CAGGTT TCTAGAAAGT GCCACCACTG 180
GGGAGGOGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCX3TCTCTCC TCCAGCAAGG 240
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAAOGT 300
CXCAGGGAAT GTGAC3VGTCC TTOGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360
AAACAAGGAG CAGGAQACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAAGCTT 420
ATGACAGGAC ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACTGGT 480
TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTAGCAATG CACCTGTGGG AGGAAAOGTT 540
ACCAGCAGTT TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CAGCCTCCTC TCOCAAAAGC 600
CAACGAGAAA TTAATGCTGA TATAAAACGT AAATTAGTGA AGGAACTCCG ATGOGTTGGA 660
CAAAAATATX3 AAAAAATCTT OGAAATGCTT GAAGGAGTGC AAGGACCTAC T6CAGTCA0G 720
AAGCGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACG AGACTTTGTT 780
AAGCACCTTA AGAAGAAACT GAAACXTTATG ATTTGAGAAT ACTTGTCCCT GGAGGATTAT 840
CACACCCCAA ATGCATAATC TCGTTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGIT 900
TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACOGA ATCAAATAGC 960
CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTPC TTTTTCCCAA AGCATTTTAT 1020
TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCTGTGCTGG TAGATATTAT 1080
TAACCCATTA GGTAAATACr ATTACAGTCG TGGTTTCTGC A



Seq ID NO: 296 Protein sequence:
Protein Accession #: Eos sequence



1 11 21 31 41 51
| | | | | |

MTDKTEKVAV DPETVPKRPR ECDSPSYQKR QRMALLARKQ GA6DSLIAGS AMSKKKKLKT 60
GHAIPPSQIiD SQIDDPTGPS KDRMKQKPGS NAPVGGNVTS SFSGDDLECR ETASSPKSQR 120
EINADIKRKL VKEI*RCVGQK YEKIFEMLEG VCX3PTAVRKR FFESIIKEAA RCMRHDFVKH 180
UCKKLKSMI

Seq ID NO: 297 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815 



1 11 21 31 41 51
| | | | | |

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60
G6CCAAACX3G GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180
GGGAGGOGCC ACAACTTCAC TGCCATTTTG 7GA0GTGG06 CCGTCTCTCC TCCAGCAAGG 240
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAAG6T 300
CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360
AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAAGCTT 420
ATGACAGGAC ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACTGGT 480
TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTAGCAATG CACCTGTGGG AGGAAACGTT 540
ACCAGCAGTT TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CAGCCTCCTC TCCCAAAAGC 600
CAACAAGAAA TTAATGCTGA TATAAAACGT AAATTAGTGA AGGAACTCCG ATGOGTTGGA 660
CAAAAATATG AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 720
AAACGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACG AGACTTTGTT 780
AAGCACCTTA AGAAGAAACT GAAACGTATG ATTTGAGAAT ACTTGTCCCT GGAGGATTAT 840
CACACCCCAA ATGCATAATC TCATTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGTT 900
TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACOGA ATCAACTGGC 960
CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA AGCGTTTTAT 1020
TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCTGTGCTGG TAGATATTAT 1080
TAACCCATTA GGTAAATACT ATTACAGTCG TGGTTTCTGC A



Seq ID NO: 298 Protein sequence: 
Protein Accession #: Eos sequence



1 11 21 31 41 51
| | | | | |

MTDKTEKVAV DPETVPKRPR ECDSPSYQKR QRMAUiARKQ GAGDSLIA6S AHSKEKKLMT 60
GHAIPPSQLD SQIDDFTGFS KDRKMQKPGS NAPVGGNVTS SFSGDDZ^CR BTASSPXSQQ 120
EINADIKRKL VKEUtCVGQK YEKIFEMLEG VQCPTAVRKR FFESIIKEAA RCMRRDFVKH 180
LKKKLKRMI

Seq ID NO: 299 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815 



1 11 21 31 41 51
| | | | | |

AGTGTTCGGC TGGGGCAGGC AOGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60
GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAAOGT 300
CCCAGGGAAT GTGACAGTCC TTOGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360
AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 420
TGACAG6ACA T6CTATTCCA CCCA60CAAT TGGATTCTCA GATTGATQAC TTCACTGGTT 480




TCACCAAftGA TABQATGAKi CAGAAACCTG 
CCAGCAGTTT CTCTGGAGAT GACCTAGAAT 
AACMGAMT TAATGCTGAT ATAAAAOSTA 
AAAAATATCA AAAAATCTTC GAAATGCTTG 
AftOGATTTTT TCAATOCAIC AICAAGGAAG 
ASCACCTTAA GAAGAAACTG AAACGTATGA 
ACAOOCCAAA TGCATAATCT CAITAATSAT 
TCTACAATQS AGCAGGATAT TGCIGRAGTC 
TTOOIGAGGC TAAGAAATTT CTGTTAGTAA 
T6AAM3GAIA ACTTBTCTTT lOSTT ATTTT 
AACnCKTTAG OTAAATACTA TTAOUSTOGT 

Seq ID NO: 300 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALIiARKQ GAGDSLIAfiS AMSKAKKLMT 
GHAIPPSQLD SQIDDFTGPS KDHMMQKPGS NAPVOGUVTS SPSCTDLECR BTASSPKSQQ 
EINADIKRKL VKEbRCVGQK YEKIPEMLEG VQGPTAVRKR PFBSIIKEAA ROIRRDPVKH 
LKXXLKRMI 

Seq ID NO: 301 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-812 






1 11 21 31 41 51
| | | | | |

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60
GGCCAAAOGG GATCGGTGCT TCTGGTCAiQA OGCCTCCCCA TGCACATCAC TCCCAGGTGC 120
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAA6T GCCACCAGTG 180
GG6AGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCOG CCGTCTCTCC TCCAGCAAGG 240
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAAGGT 300
COCAOGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360
AAAC2UVGC3AG CAGGAGACAfi CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAGCTTA 420
T6ACAGGACA TGCTATTGCA CCCAGCCAAT TGGATTCrCA GATTGATGAC TTCACTGGTT 480
TCAGCAAAGA TGGGATQATG CAGAAACCIG GTAGCAATGC ACCTGTGGGA GGAAATGTTA 540
CCAGCAATTT CTCTGGAGAT GACCTAGAAT GCAGAGGAAT AGOCTCCTCT CCCAAAAGCC 600
AACAAGAAAT TAATGCTGAT ATAAAATGTC AAGTAGTGAA GGAAATCCGA TGCCTTGGAC 660
AATATGAAAA AATCTTOGAA ATGCTTGAAG GAGTGCAAGG ACCTACTGCA GTCAGGAAAC 720
GATTTTTTGA ATCCATCATC AAGGAAGCAG CAAGATGTAT GAGACGAGAC TTTGTTAAGC 780
ACCTTAAGAA GAAACTGAAA CGTATGATTT GAGAATACTT GTCCCTGGAG GATTATCACA 840
CCCCAAATGC ATAATCTCAT TAATGATTGA GGAGAGAAAA GGATCAGATT GCTGTTTTCT 900
ACAATGGAGC AGGATATTGC TGAAGTCTCC TGGCATATGT TAOCQAATCA ACTGGCCTTC 960
CAGAGGCTAA GAAATTTCTG TTAGTAAAAG ATGTTCTTTT TCCCAAAGC6 TTTTATTTGA 1020
AAGGATAACT TGTGTTTTGG TTATTTTGTA TTCCCACCTG TGCTGGTAGA TATTATTAAC 1080
CCATTAGGTA AATACTATTA CAGTCGTGGT TTCTGCA



Seq ID NO: 302 Protein sequence: 
Protein Accession #: Eos sequence



1 11 21 31 41 51
| | | | | |

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKEKKLMT 60
GHAIPPSQLD SQIDDFTGPS KDGMMQKPGS NAPVGGHVTS HFSCTOLBCR GIASSPKSQQ 120
EINADIKCQV VKEIRCLGQY EKIPEMLEGV QGPTAVRKRF FESIIKEAAR CMRRDFVKHL 180
KKKLKRMI

Seq ID NO: 303 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815 



1 
i 

AGTGTTCGGC 
GGCCAAACAG 
CCTAGGGGGC 
GGGAG6CGGC 
GAAACAATGA 
CCCAGGGAAT 
AAACAAGGAG 
TGACAGGACA 
TCAGCAAAGA 
CCAGCAGTTT 
AACAAGAAAT 
AAAAATATGA 
AACGATTTTT 
AGCACCTTAA 
ACACCCCAAA 
TCTACAATGG 
TTCCAGAGGC 
T6AAAGGATA 
AACCCATTAG 



11 
I 

TGGGACAGGC 
GATCGGTGCT 
ACATTTCCCA 
ACAACTTCAC 
CCGATAAAAC 
GTGACAGTCC 
CAGGAGACAG 
TGCTATTCCA 
TAGGATGATG 
CTCTGGAGAT 
TAATGCTGAT 
AAAAATCTTC 
TGAATCCATC 
GAAGAAACTG 
TGCATAATCT 
AGCAGGATAT 
TAAGAAATTT 
ACTTGTGTTT 
GTAAATACTA 



21 
I 

ACGCTGTGGC 
TCTGGTGAGA 
CAACT CCCAG 
TGCCATTTTG 
AGAGAAGGTG 
TTCGTATCAG 
CCTTATTGCA 
CCCAGCCAAT 
CAGAAACCTG 
GACCTAGAAT 
ATAAAAOGTA 
GAAATGCTTG 
ATCAAGGAAG 
AAACGTATGA 
CGTTAATGAT 
TGCTGAAGTC 
CTGTTAGTAA 
TGGTTATTTT 
TTACAGTCGT 







Seq ID NO: 304 Protein sequence: 
Protein Accession #: Eos sequence




1 11 21 31 41 51 

MTDKTEKVAV DPETVFKRPR ECUSPSYQKR QRMALLARKQ GAC33SLIAGS AMSKRKKUff 
CTAIPPSQU) SOIDDPTGPS KDSMMQKPGS NAPVGGNVTS SFSGDELECR ETASSPKSQQ 
EIUADIKHKL VKELRCVOQK YEICIPEMLBG VQGPTAVRKR FFBSIIKEAA RCMRBDiVKH 
LKKKLKSMI 

Seq ID NO: 305 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 67-689






60 
120 
180 



1 
1 

CGTGGAGGCA 
CCAGACTAGC 
AGATGTCOGC 
CAGAGGTCCC 
TGTCCGGGAA 
ATCGGGAAAT 
CTCCCAAAAG 
AATCCACAAA 
ATAATTTAAA 
AGTATGAGAA 
CTGCTAAAGT 
AGGAG6AGGA 
TTAGAGTAGG 
ATTAGGTTTA 
AATTGTCAGT 
AACTTGTACA 
CTGTGCACTT 
ATTTGTAAGG 
TATCPATAGT 
GCGTTGAGGC 
GAGGCTGGAC 
GTATATAGTG 
CATGAGAATA 
TATA6AACTC 
CTCCTGrPACr 
GAAATGTTTT 
AGTCAATTTC 
CTCCCTATAA 
TGAAGGAGAG 
AT6AAGTCTG 
AGG AAGGT GG 
CCTATTTT6T 
AAATTAAGGC 
ACATTATTTG 
AGTGAGGGTA 
TTGGAAACAC 
AGGGCAGGCr 
GCCTGCTCAT 
GAAGGAGCTT 
GTTGGGGTGA 
CTGATGTGTA 
GTGAGTGTTG 
CGCAGGAGTG 
GTTGAGAAAC 
TACX5AGTTAT 
CATCAGAACT 
AAAACTCGGT 
GCCATGTCCT 
CTAGGCCAAG 
GGTCAAAAGG 
CAGGGAAGGG 
GTGGGGGAGC 
TGTCAAGTTG 
CACACTGTGG 
AAGTGGCATG 
TPCTTAAGAT 
CTACrCCCTC 
AGGCTGTGAT 
GTATTTGGGG 



11 






Seq ID NO: 307 DNA sequence
Nucleic Acid Accession #: NM_022342
Coding sequence: 1..2178



GCTAGCGOGA 
GAACAATACA 
TTATGCCTTC 
TGTCAATTTT 
AGAGAAATCT 
GAAGGATTAT 
GCCACCGTCT 
CCCCGGCATC 
TGACAGTGAA 
GGATGTTGCT 
T6CCCGGAAA 
GGAGGAGGAG 
GGAGCGCOGT 
ATTACAAAAT 
GGTTTACATG 
TATTTCCAAA 
TCCTGTT6GT 
TGGTGGTAAC 
TTGTAAAAAG 
TGTGGGGAAG 
CTGTTGACTC 
ACATAGCATT 
TTTTTTTTTT 

ttcattgtca 
taaacaosat 
tgaagttaaa 
tgactcacag 
atgtggtagc 

GGCTACTTGA 
GAGGAGTTAG 
GIGATTAGGA 
GGGGCCAAAT 

CTTATTGTTT 

tggtgccx:aa 
tgtgggatgg 
caaacacccc 
aatggaatca 

AAGTTTAGCT 
GGTTTGTGTG 
GGGGAGATGG 
TATACATCAT 
CTATTGCCCA 
TTTTTGTGCT 
TTGCATGTCT 
GGTCACGGTC 
GTTTGTCCTG 
TGTGAGGTTT 
TGTCACTTGG 
ATTCGGGAGC 
GAGTGATTTG 
GCAAGGATGG 
AGTTTAGCCA 
AGGCCACTTG 
CAAGATTGCT 
ATGTTACCTA 
GCCAACCTGT 
TAACCACCTC 
GTAGT6ACTA 
AC6TTGGATG 



21 

GGCTGGGGAG 
GTCAGGATGG 
TTTGTGCAGA 
GCGGAATTTT 
AAATTIGATG 
GGAGCAGCTA 
GGATTCTTCC 
TCTATTGGAG 
AAGCAGGCTT 
GACTATAAGT 
AAGGTGGAAG 
GATGAATAAA 
AATTGACACA 
TTGATCACGA 
AAGTGGCCAT 
CATTTTTAAA 
GTGACAAGGC 
TATGGTTATT 
AACA AAACAA 
ATGCCTTTTG 
TGCAGGGGGC 
CTGCTGCOVT 
TAAGTGCX3GT 
GCAAAGCAAA 
TCGCAACGTT 
TAAACAGTAT 
CAGT6AACAA 
TTCTTTTATT 
AGCTACTGTG 
GAGAAGGACA 
CTGAGGCTAT 
GCATTGCTAA 
TTCTCTTTCA 
CATTTGGGGT 
GGTGGTGGGG 
AAGGAAGATG 
ACCATTTCTG 
CATTCACTGG 
TCAGTGGTTA 
CCACAGTAGC 
TACTGTCCGT 
GCATTAATAT 
ATTAATTTTA 
GGAGGCGGTG 
ACAGCCTGAT 
AATGTGTTCC 
GCCCAGAGGC 
CATTCTAAGC 
TGTTGCCAGC 
TTAAGTGGTG 
AAAGGGGTAA 
GATGATCTTT 
GTCCATTAGC 
CTTCTAGTGG 
AGGCTTAGGC 
TGCTTTTTTT 
ACCCCATTCT 
TTGTCTGTGT 
CATTCATTTT 



31 41 

I I 

OGCTGAGCOG CGC6T0GTGC 
CTAAAGGTGA CCCCAAGAAA 
CATGCAGAGA AGAACATAAG 
CCAAGAAGTG CTCTGAGAGG 
AAATGGCAAA GGCAGATAAA 
AGGGAGGCAA GAAGAAGAAG 
TGTTCTGTTC AGAA TTCOG C 
ACGTGGCAAA AAAGCTG66T 
ACATCACTAA GGCG6CAAAG 
CGAAAGGAAA GTrTGATGGT 
AGGAAGATGA AGAAGAGGAG 
GAAACTGTTT ATCTGTCTCC 
TCTCTTATTT GAGAAGTGTC 
TCATATTGTA GTCTCTCAAA 
GGGTGTCTGG AGCACCCTGA 
ATGAAAAGGC ACTCTCGrGT 
ATTTAAAGAT GTTTCTGGCA 
GGCTAGAAAT OCTGAGTTTT 
CCGAGACAAA OCCTTGATGC 
GGAGAGGCTG TAGCT CAGGG 
ATCCATTTAG CTTCAGGTTG 
CrrAGCTGTG GACAAAGGGG 
AGTTTTTAAA CTGTTTGTTT 
GAGTCACTGC ATCAATGAAA 
CTGTTATTTT TTTTGTATGT 
TACATTTTTA AAACTCTTCT 
ACCCCCACTC CATTGTATTT 
ACTCAGTGGC CAGCTCACTT 
TGATTTTGTT TGTGTCTGAG 
TAGGCAAGGT TCAGCAGCCT 
CTAGGTTTAA CTTTTGTCCC 
ACAGCAATTT CAGAGTGTAT 
CCCCTACCCC CCGTGCTCCT 
CTTGAGCCTG CTGCTGGTCT 
TAGGGGACGG TATCCTTTTT 
ATAGGCTCCA TCTTGGGCX:A 
AGCACTAAAT GTATCATGAA 
AAATGTAGAT TGATGTTCAA 
TATTAGTGGG TAGTGTAACA 
AAGTGGTGAC ACTAAATAOC 
AGCAATGAAG GATACAGTAC 
TTGGGTGTGT ATGTTTGAGG 
AGAGAAAGCA CCTTTTTCTT 
TCCTCTCCGC OTrGTOGGGT 
CTCTTATGTG TTCATAGCCA 
TCTAGTTCTA GAAAATGACC 
ACTTGTTCCA GAATTTCCGC 
TAAAGCTTTA GCTTCCCAAT 
CTCGTCAAAT ATGGAAGAGA 
OGCGTCTATC TCATAACTAG 
CTTTTGTGCT TCCAAAGTAG 
GATTAGGCAA ACATTGAGTT 
TGGGGCAGCA AGATCACTAC 
AATAATGCCC TAGTTTCTCT 
TTAGCTTGAT TTCTGGGCCX: 
TTTTTTTTCC CCCATTTAAA 
TGAATGACAT TTTATCCTTC 

cr c cnsTGiG tgtctgttct 

CTGTAATAAA G 



51 
I 

CCTGOGCTGC 
CCAAAGGGCA 
AAGAAAAACC 
TGGAAGACGA 
GTGCGCTATG 
GATCCTAATG 
CCCAAGATCA 
GAGATGTGGA 
CTGAAGGAGA 
GCAAAGGGTC 
GAGGAAGAAG 
TTGTGAATAC 
TGTTGCXXrrC 
GTGCTCTAGA 
AACTGTATCA 
TCTCCTCACT 
TTTTCTTTTT 
CAACTGTATA 
TCCTTGCTCG 
CX3TGCACTGT 
TCTTGTTTCT 
GGTCAGCTGG 
TTAAACAAAC 
GTTCAAGAAC 
TTAGAATGCT 
CTATTATAAC 
GGAGACTGGC 
AGGGCTGAGA 
TGGCATTCAG 
TCCAAGGTAT 
ACCTCCACCC 
GGTGTGTCAA 
GGCACATATC 
CCIGGATGCC 
TTGCTCCTAC 
CCTGAGCTAT 
AAGTTGAATG 
TGTTAAACTG 
TTTTATCCAG 
ATTTTGAAGG 
TGTGTTGTGG 
CTATGAAACA 
AAAATTCACT 
CCTGGATGAG 
TTCGCTCTCC 
ACTAATTTAA 
TCCTGCTTCA 
TCGTGATGTG 
AACAACCTGC 
ATGTACCAAC 
CTAAGCAGAA 
TTAAAGAGGC 
TCAMGTTTT 
GAGATGATGT 
ACTGTCTGTG 
AGGATAGTAC 
GGAAAGAACA 
TGTCACAAAT 



Seq ID NO: 306 Protein sequence: 
Protein Accession #: NP_005333.1



51 



60 
120 

180

240

300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 



1 11 21 31 41 51

MAKGDPKKPK GKMSAYAFFV OTCREBHIOaC MPEVPVNPAB FSKXCSBHWK TMSOKBKSKP 
DEMAKADKVR YDREMKDYGP AKGGKKKKDP MAPKRPPSGP FLPCSEPRPK IKSTNPGISI 
GDVAKKLGEM WNNLNDSEKQ PYITKAAKLK EKYEKDVADY KSKGKPDGAK GPAKVARXKV 



60 
120 
180 






1 11 21 31 41 51

ATCGGTACIA GGAAAAAAGT TCATGCATTT GTCCGTGTCA AACCCACOGA TGACTTTCCT 60 

CATGAAATCA TCAGATAOGG AGATGACAAA AGAAGCATTG ATATTCACrr AAAAAAflGAC 120 

ATTOGGAXSAG GAGTTCTCAA TAACCAACAG ACAGACTGGT CGrrTAAGTT GGATGGAGTT 180 

TTCAOGATG CCTCCCAGGA CtTQGTTTAT GAGACAGTTG CAAflCGATGT GGTTTCrCAG 240 

CCCTCGATG GCTATAATCG CACCATCAIG TCSTTATGGGC AGAO G^AGC TGGCAACACA 300 

ACACCATGA TGGGGGCAAC T6AGAATTAC AAGCACCGGG GGATCCTCOC TOjrGOOCTG 360 

ACCAGGTrr TTAGGATCAT CGAAGAACGC CCCACACATG CCATCACTGT GCGTGTTTCC 420 

ACTTCGAAA TCTATAATGA GAGCCTGTTT GATCTCCTGT CCACTCTGCC CTATGTTG^ 480 

CCTCAGTCA CACCAATGAC CATCGTGGAA AACOCTCAAG GAGTCTTCAT TAAGGGCTTG 540

CAGTTCAOC TCACAAGTCA GGAGGAGGAT GCATTCRGCC TCCnTTTGA GGGTGJCS^ 600 

ACAGGATTA TAGCCTCCCA CACTATGAAC AAAAACTCTT CCAGATCACA CTGCTJTTTC 660 

CCATCTACr TAGAGGCCCA TTCCCGGACC TTATCAGAGG AAAAGTACAT CACTTCCAAA 720 

TgS^C AGGCT^ 

TCCTGAAGG AAGCCACCTA CATCAACAAA TCGCTCTCAT TCCTOGAGCA GGCCAT^TT 840

CCXrrrGGGG AOCAGAAGCG GGACCACATC CCCTTTCGGC AGTGCAAGCr CACCCAOGCT 900 

TGAAGGACr CGTTAGGGGG AAACTtXZAAT ATGGTCCT^ 960 

CTGCCCACT TAGAAGAAAC GCTATCTTCA CTGAGATTTG CCACCAGGAT GAAGCTMTC 1020 

CCACTGAGC CTGCCATCAA TGAAAAGTAT GATGCTGAGA GAATGGTCAA GAACCTGGAC 1080 

AGGAACTAG CACTACTCAA GCAGGAGCTG GCTATCCATG ACAGCCTGAC CAAMMACC 1140 

TTGTGACCT ATGACCCCAT GGATGAAATC CAGATTCCTC AGATCAACTC CCAGGTCCGG 1200 

GGTACCTGG AGGGGACACT GGAOGAGATC GACATAATCA GCCTTAGACA GATCA AGGAG 1260 

TGTTCAAOC AGTTCCGGGT GGTTCTGAGC CAACAGGAAC AGGAAGTGGA GTCCACTTTG 1320 

GC3USGAACT ACACCCTCAT TGACAGGAAT GACTTTGCAG CCATTTCrGC TATCC^AAG 1380 

OGGGGCTTC TGGATGTTGA TGGCCACCTA GTGGGT6AGC CTGAAGGACA AAACTTTGGA 1440 

TCGGAGTCX3 CCCCTTTCTC TACCAAACCT GGGAAGAAAG CCAAGTCCAA GAAGACATTC 1500 

AAGAGCCAC TCAGGCCCX3A CACCCCACCC TCCAAACCAG TGGOCTTTGA GGAGTTTAAG 1560 

ATGAGCAWS CTAGTX3AGAT CAACCGAATT TTCAAAGAAA ACAAATOCAT CTTGAATGAA 1620 

GGAGGAAAA GQGCCAGCGA GACCACACAG CACATCAATG CCATCAAGCG GGAGATTGAT 1680 

TGACCAAGG AGGCCCTGAA TTTCCAGAAG TCACTACGGG AGAAGCAAGG CAAGTACGAA 1740 

ACAAGGGGC TCATGATCAT OGATGAGGAA GAATTCCTGC TGATCCTCAA GCTCAAAGAC 1800 

TCAAGAAGC AGTACCGCAG CGAGTACCAG GACCTGt3GTG AOCTCAGGGC TGAGJTCCAG 1860

ATTGCCAGC ACCTAGTCGA TCAGTGTOGC CACCX3CCTGC TCATGGAATT TCACATCTGG 1920 

ACAATX3AGT CCTTTCTCAT CCCTGAGGAC ATGCAGATGG CACTGAAGCC ACGCGGCAGC 1980 

TCCGGCCAG GCATGGTCCC TGTCAACAGG ATTGTGTCTC TGGGAGAAGA TGACCAGGAC 2040 

AATTCAGCC AGCTGCAGCA GAGGGTGCTT CXTTGAGGGCC CTGATTCCAT CTCCTTCTAC 2100 

ATGCCAAAG TCAAGATAGA GCAGAAGCAT AATTACTTGA AAAOCATGAT GGGOCTCCAG 2160 
A6GCACATA GAAAATAG 



Seq ID NO: 308 Protein sequence: 
Protein Accession # : NP_071737 

1 11 21 31 41 51 

KGTRKKVHAF VRVKPTDOFA HEMIRVGDDK RS1DIHI.KKD IRRCWNMQQ TDWSFKLDGV 60 

LHDASQDLVY ETVAKDWSQ ALDGVMGTIM CYGQTGAGKP YTMMGATEHY KHUGILPSAL 120 

QQVFRMIEER PTHAITVRVS YliEIYNESbF DLLSTLPYVG PSVTPMTIVE NPQGVPIKGL 180 

SVHLTSQEED AFSLLPEGETT NRIIASHTMN KNSSRSHCIF TIYLEAHSHT LSEEKYITSK 240 

IKLVDLACSE RLGKSGSEGQ VLKEATYINK SLSPLEQAII ALGDQKRDHI PFRQCKLTHA 300 

LKDSLGGNOl HVLVTKIYGE AAQLEBTLSS LRPASRMKLV TTEPAINEKY DAEHMVKNLE 360 

XELALliKQEL AIHDSLTNRT PVTYDPMDBl QIAEIKSQVR RYLEGTLDEI DIISLRQIKE 420 

VPNQFRWLS QQEQEVBSTL RRKYTLIDRN DFAAISAIQK AGLVDVDGHL VGEPEGQNFG 480 

LGVAPFSTKP GKKAKSKKTF KBPLRPDTPP SKPVAFEEFK HEQGSBINRI FKENKSIIiNE 540 

HRKRASETTQ HINAIKREID VTKEALNFQK SLREKQGKYE NKGLMIIDEB EFLLJLKLKD 600 

LKKQYRSEYQ DLRDLRAEIQ YOQHLVDQCR HRLLMEFDIW YNESPVIPED MQMALKPGGS 660 

IRPGMVPVNR IVSLGEDDQD KFSQLQQRVL PBGPDSISFY NAKVKIEQKH NYLKTMMGLQ 720 
QAHRK 



Seq ID NO: 309 DNA sequence 

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51 

rrr r fTTTT T ttttttttaa tgcctgctgt catgctcigt ctaccagggt GAATTTCOU^ 60 

AAATTTCTGC ATAGCAATTT TAGCCAAAAC TATATATGTT CTGGGGAGGA TAGGC aTAGG 120 

CACATTCAAG ACCAAAGGAA AGAGTGAAGA AGTGTAGTTG GGTCATTGTG AATGGAIGTT 180 

TAGATTGTCA AGAAAAtTTGG GCCAGAGGCC CCACCTCACA CTAGGAQGGC AATTGCCTCT 240 

CATTAGTATC TCAGGCACCA TGGGTCTTAT TTGGTGTCAT AAGAAACACC CTCAACAAAG 300 

TAATGAACCC TCAGCCTCCA GCTTCTCTTC TTCX»GATTC TTCTTAGGGC CTCCTTTTTC 360 

CTTTTATGTT TCCAGTACCC TGAATTTCTT ATTCCCATCC CCCATTAAAA TCTGCTTCAA 420 

AGAAAAAACA ACAAGGACAC ATTCACTTTA AGATOCAAAT GAATGATAAC AGCTTAAAAC 480 
ATTATACTTA TCAGTATTAT TTGCATTTTT ATAGAAACCA AAACCATATT TCAACAAC 



Seq ID NO: 310 DNA sequence 

Nucleic Acid Accession #: NM_018622.2

Coding sequence: 1-1140

1 11 21 31 41 51

| | | | | |

ATGGCGTGGC GAGGCTGGGC GCAGAGAGGC TGGGGCTGCG GCCAGGOGTG GGGTGCGTOG 60 

GTGGGCGGCC GCAGCTGCGA GGAGCTCACT GCGGTCCTAA CCCCGCCGCA GCTCCTOSGA 120 

CGCAGGTTTA ACTTCTTTAT TCAACAAAAA TGCGGATTCA GAAAAGCACC CAGGAAGGTT 180 

GAACCTCGAA GATCAGACCC AGGGACAAGT GGTGAAGCAT ACAAGAGAAG TGCTTTGATT 240 

CCTCCTGTGG AAGAAACAGT C m T A TCCT TCTCCCTATC CTATAAGGAG TCTCATAAAA 300 

CCTTTATTTT TTACTGTTGG GTTTACAGGC TGTGCATTTG GATCACCTGC TATTTGGCAA 360 




TATGAATCAC TGAAAT<XAG GGTGCAGAGT TATTTTGATG GTATAAAAGC TGATTGGTTG 420 

GATAGCATAA GACCACAAAA AGAAGGAGAC TTCAGAAAGG AGATTAACAA GTGGTG6AAT 480 

AACCTAAGTG ATGGCCAGCG GACTGTGACA GGTATTATAG CTGCAAATGT (XTrGTATTC 540 

TGTTTATGGA GAGTACCTTC TCTGCAGCGG ACAATGAICA GATATTTCAC ATCGAATCCA 600 

GCCTCAAAGG TCCTTTGTTC TCCAATGTTG CTGTCAACAT TCAGTCACTT CTCCTTATTT 660 

CACATGGCAG CAAATATGTA TGTTTTGTGG AGCTTCTCTT CCAGCATAGT GAACATTCTG 720 

GCTCAAGACC AGTTCATGGC ACTCTACCXA TCTGCAGCTC TTATTTCCAA TTTTGTCaGT 780 

TACCTGGGTA AAGTTGCCAC AGGAAGATAT GGACCATCAC TTGGTGCATC TGGT GCCAT C 840 

AIGACAGTGC TOGCAGCTGT CXGGACTAAG ATGCCAGAAG GGAGGCTTGC CATTATTTTC 900 

CrTCOGATGT TCACGTTCAC AGCAGGGAAT GCCCTGAAAG CCATTATCGC CATGGATACA 960 

GCAGGAATGA TCCTGGGATG GAAATTTTTT GATCATGCGG CACATCTTGG GGGAGCTCTT 1020 

TTTGGAATAT GGTATGTTAC TTACGGTCAT GAACTGATTT GGAAGAACAG GGAGCCCCTA 1080 
GTGAAAATCT 06CATCAAAT AAG6ACXAAT GGOCOCAAAA AAGGAGGTGG CTCTAAGTaA 

Seq ID NO: 311 Protein sequence:
Protein Accession #: NP_061092.2

1 11 21 31 41 51

| | | | | |

MAMRGWAQRG HGCGQAWGA3 VGGRSCEELT AVLTPPQLLG RRFHPPIQQK OGFRKAPRKV 60 

EPRRSDPGTS GEAYKRSALI PPVEETVPYP SPYPIRSIiIX PIiPPTVGFTG CAFGSAAIHQ 120 

YESLKSRVQS YPDGIKADWL DSIRPQKEGD FRKEINKKWN NLSDGQRTVT GIIAANVLVP 180 

CLHRVPSLQS TMIRYFTSNP ASKVLCSPML LSTPSHFSLF HMAAKMYVLW SFSSSIVNIL 240 

GQEQFMAVYL SAGVZSNFVS YLGKVATGRY GPSLGASGAI MTVI»AAVCTK IPEGRLAIIP 300 

LPHPTFTAGM ALKAIIAMDT AGMILGWKPP DHAAHLGGAL FGIHWTYGH ELIWKNREPL 360 
VKIWHEIRTN GPKRGG6SK 

Seq ID NO: 312 DNA sequence
Nucleic Acid Accession #: NM_000625
Coding sequence: 195..3656

1 11 21 31 41 51

| | | | | |

CTCTC3GGCCA CCTTTGATGA GGGGACTGGG CAGTTCTAGA CAGTCCCGAA GTTCTCAAGG 60 

CACAGGTCTC TTCCTGGTTT GACTGTCCTT ACCCCEGGGA GGCAGTGCAG CCAGCTGCAA 120 

GCCCCACAGT GAAGAACATC TGAGCTCAAA TCCAGATAAG TGACATAAGT GACCTGCTTT 180 

GTAAAGCCAT AGAGATGGCC TGTCCTTGGA AATTTCTGTT CAAGACCAAA TTCCACCACT 240 

ATGCAATGAA TGGGGAAAAA GGCATCAACA ACAATGTCGA GAAAGCCCCC TGTGCCACCT 300 

CCAGTCCAGT GACACAGGAT GACCTTCAGT ATCACAACCT CAGCAAGCAG CAGAAT6AGT 360 

CXCCGCAGCC OCTCXrrGGAG ACGGGAAAGA AGTCTCCAGA ATCTCTGGTC AAGCTGGATQ 420 

CAACCCCATT GTCCTCCCCA CGGCATGTGA GGATCAAAAA CTGGGGCAGC GGGATGACTT 480 

TCCAAGACAC ACTTCACCAT AAGGCCAAAG GGATTTTAAC TTGCAGGTCC AAATCTTGCC 540

TGGGGTCXZAT TATGACTCCC AAAAGTTTGA CCAGAGGACC CAGGGACAAG CCTACCCCTC 600 

CA6ATGAGCT TCTACCTCAA GCTATCGAAT TTGTCAACXZA ATATTACXX5C TCCCTCAAAG 660 

AGGCAAAAAT AGAGGAACAT CTGGCCAGGG TGGAAGCG6T AACAAAGGAG ATAGAAAGAA 720 

CAGTAACCTA CCAACTGACG GGAGATGAGC TCATCTTCGC CACCAAGCAG GCCTGGCGCA 780 

ATGCCCCACG CTGCATTGGG AGGATCCAGT GGTCCAACCT GCAGGTCTTC GATGCCCGCA 840 

GCTGTTCCAC TGCCCGGGAA ATGTTTGAAC ACATCTGCAG ACACGTGCGT TACTCCACCA 900 

ACAATGGCAA CATCAGGTCG GCCATCACCG TGTTCCCCXA GCGGAGTGAT GGCAAGCAGG 960 

ACTTCOGGGT GTGGAATGCT CAGCTCATCX; GCTATGCTGG CTACCAGATG CCAGATGGCA 1020 

GCATCAGAGG GGAOCCTGCC AAOSTGGAAT TCACTCAGCT GTGCATCGAC CTGGGCTGGA 1080 

AGCCCAAGTA OGGCCGCTTC GATGTGGTCC CCCTGGTCCT GCAGGCCAAT GGCCXSTGACC 1140 

CTGAGCTCTT CGAAATCCCA CCTGACCTTG TGCTTGAGGT GGOCATGGAA CATCCCAAAT 1200 

ACGAGTGGTT TCX3GGAACTG GAGCTAAAGT GGTACGCCCT GCCTGCAGTG GCCAACATGC 1260 

TGCTTGAGGT GGGCGGCCTG GAGTTCCCAG GGTGCCCCTT CAATGGCTGG TACATGGGCA 1320 

CAGAGATCGG AGTCCGGGAC TTCTGTGATG TCCAGCGCTA CAACATCCTG GAGGAAGTGG 1380

GCAGGAGAAT GGGCCTGGAA ACGCACAAGC TGGCCTCGCT CTGGAAAGAC CAGGCTGTOG 1440 

TTGAGATCAA CATTGCTGTG CTCCATAGTT TCCAGAAGCA GAATGTGACC ATCATGGACC 1500 

ACCACTCOGC TGCAGAATCC TTCATGAAGT ACATGCAGAA TGAATACCGG TCCCGTGGGG 1560 

GCTGCCOGGC AGACTGGATT TCGCTGGTCC CTCCCATGTC TGGGAGCATC ACCCCCGTGT 1620 

TTCACCAGGA GATGCTGAAC TACGTCCTGT CCCCTTTCTA CTACTATCAG GTAGAGGCCT 1680 

GGAAAACCCA TGTCTGGCAG GACGAGAAGC GGAGACCCAA GAGAAGAGAG ATTCCATTGA 1740 

AAGTCTTGGT CAAAGCTGTG CTCTTTGCCT GTATGCTGAT GOGCAAGACA ATGGCGTCCC 1800 

GAGTCAGAGT CACCATCCTC TTTGCGACAG AGACAGGAAA ATCAGAGGCG CTGGCCTGGG 1860 

ACCTGGGGGC CTTATTCAGC TGTGCCTTCA ACCCCAAGGT TGTCTGCATG GATAAGTACA 1920 

GGCTGAGCTG CCTGGAGGAG GAACGGCTGC TGTTGGTGGT GACCAGTAOG TTTGGCAATG 1980 

GAGACTGCCC TGGCAATGGA GAGAAACTGA AGAAATCGCT CTTCATGCTG AAAGAGCTCA 2040 

ACAACAAATT CAGGTACGCT GTGTTTGGCC TCGGCTCCAG CATGTACCCT CGGTTCTGCG 2100 

CCTTTGCTCA TGACATTGAT CAGAAGCTGT CCCACCTGGG GGCCTCTCAG CTCACCCCGA 2160 

7GGGAGAAGG GGATGAGCTC AGTGGGCAGG AGGACGCCTT CCGCAGCTGG GCCGT6CAAA 2220 

CCITCAAGGC AGCCTGTGA6 AOGTTTGATG TC0GAG6CAA ACAGCACATT CAOATCCCCA 2280 

AGCTCTACAC CTCCAATGTG ACCTGGGACC OGCACCACTA CAGGCTCGTG CAGGACTCAC 2340 

AGCCTTTGGA CCTCAGCAAA GCCCTCAGCA GCATGCATGC CAAGAACGTG TTCACCATGA 2400 

GGCTCAAATC TCGGCAGAAT CTACAAAGTC CGACATOCAG CCGTGCCACC ATCCTGGTGG 2460 

AACTCTCCTG TGAGGATGGC CAAGGCCTGA ACTACCTGCC GGGGGAGCAC CTTGGGGTTT 2520

GCCCAGGCAA GCAGCCGGCC CTGGTCCAAG GCATCCTG6A GCGAGTGGTG GATGGCCCCA 2580

CACCCCACCA GGCAGTGOGC CTGGAGGCCC TGGATGAGAG TGGCAGCTAC TGGGTCAGTG 2640 

ACAAGAGGCT GCCCCCCTGC TCACTCAGCC AGGCCCTCAC CTACTTCCTG GACATCACCA 2700 

CACCCCCAAC CCAGCTGCTG CTCCAAAAGC TGGCCCAGGT GGCCACAGAA GA6CCTGAGA 2760 

GACAGAGGCT GGAGGCCCTG TGCCAGCCCT CAGAGTACAG CAAGTGGAAG TTCACCAACA 2820 

GCCCCACATT OCTGQAGGTG CTAGAGGAGT TCCCGTCCCT GCGGGTGTCT GCTGGCTTCC 2880 

TGCTTTCCCA GCTCCCCATT CTGAAGCCCA GGTTCTACTC CATCAGCTCC CCCCGGGATC 2940 

ACACGCCCAC GGAGATCCAC CTGACTGTGG CCGTGGTCAC CTACCACACC CGAGATGGCC 3000 

AGGGTCCCCT GCACCAOGGC <?rCTGCAGCA CATGGCTCAA CAGCCT6AA6 CCCCAAGAOC 3060 

CAGTGCCCTG CTTTGTGOGG AATGCCAGCG GCTTCCACCT CCCCGAGGAT CCCTCCCATC 3120 




CTEGCATCCT OVTOGGGCCT GGCACAGGCA TOGOGCCCTT OOGCAGTTrC TG GCfiGCAA C 3180 

GGCTCCATGA CTCCCAGCAC AAGQGAGTGC GGGGAGGCOG CATGACCTTG GTGTTTGGGT 3240 

GCOGCCGCCC AGATCAGGAC CAOVTCTACC AGGAGGAGAT GCTGGAGATG GCC CAGA AGG 3300 

GGGTGCTCCA TGCCSGTCCAC ACAGCCTATT CCOGCCTGCC TCGCAAGCCC AAGGTCTATG 3360 

TTCAGGACAT CXTreOGGCAG CAGCTQGCCA GCGAGGTGCT CCGTCTGCTC CACAAfiGAGC 3420 

CAGGCCWXrr CTATGTTTGC GGGGATGTGC GCATGGCCCX; GGAOGTGGCC CACACCCTGA 3480 

AGCAGCTGCT GGCTGCCAAG CTGAAATTGA ATGAGGAGCA GGTCGAGGAC TATTTCTTTC 3540 

AGCTCAAGAG GCAGAAGOGC TATCAOGAAG ATATCITTCG TGCTOTATTT CCTTAOGAGG 3600 

CGAAGAAGGA CAGGC?IGGCG GTGCAGCCCA GCftfXCTGGA GATGTCAGOG CTCTGAGGGC 3660 

CTACAGGAGG G(3TTAAAGCT GOCGGCACAG AACTTAAGGA TGGAGCCAGC TCTGCATTAT 3720 

CTGAGGTCAC AGGGCCTGGG GAGATGGAfSG AAAGTGATAT CCCCCAGCCT CAAGTCTTAT 3780 

TTCXrrCAACG TTGCTCCCCA TCaAGCCCTT TACTTGACCT CX:TAACAAGT AGCACCCTGG 3840
ATT6ATGG6A GOCTC 



Seq ID NO: 313 Protein sequence:
Protein Accession #: NP_000616



1 11 21 31 41 51
| | | | | |

MACPHKFLFK TKFHQYAMNG BKGINNNVEK APCATSSPVT QDDLQYHNLS KQQNESPQPL 60

VETGKKSPES LVKLDATPLS SPRHVRIKNW GSGMTFODTL HHKAKBILTC RSKSCLGSIM 120

TPKSLTRGPR DKPTPPDELL PQAIEFVNQY YGSLKEAKIE EHLARVEAVT KEIETrVTYQ 180 

LTOJELIPAT KQAWRNAPRC IGRIQWSNLQ VFDARSCSTA REMFEHICHH VRYSTKIK2II 240 

RSAITVPPQR SDGKHDPRVW NAQLIBYAGY QMPDGSIRGD PANVEFTQLC IDI/WKPKYG 300 

RFDWPLVLQ AHGRDPELFB IPPDIiVIiEVA MEHPKYEMFR ELEI*KWYAIiP AVANMLLEVG 360 

GLEFPGCPFN GWYMGTEIGV RDFCDVQRYN ILEEVGRRKG LETHKLASLW KDQAWEINI 420 

AVLHSFOKQN VTIMDHHSAA ESFMKYMQlffi YRSRGGCPAD WIWLVPPMSG SiTPVFHQQ4 480 

LNYVLSPFYY YQVEAWKTHV WQDEKRRPKR REIPLKVLVK AVI.PA(MLMH KTMASRVRVT 540 

ILFATEIXWS EALAWDLGAL FSCAPNPKW CMDKYRI*SCL EEBRLLLWT STFtaiGDCPG 600 

NGEKLKKSLF MLKELNNKFR YAVFGLGSSM YPRFCAFAHD IDQXLSHLGA SQLTPKGEGD 660 

ELSGQEDAFR SWAVQTFKAA CETFDVRGKQ HIQIPKLYTS NVTWEPHHYR LVQDSQPLDL 720 

SKALSSMHAK NVFTMRLKSR QNLQSPTSSR ATILVELSCE DGQGLNYLPG EHLGVCPGNQ 780 

PALVQGILER WDGPTPHQA VRLEALDBSG SYWVSDKRI.P PCSLSQALTY FLDITTPPTQ 840 

LLLQKLAQVA TEEPERQRLE ALOQPSEYSK WKPTKSPTPL EVLEEPPSLR VSAGFLLSQL 900 

PILKPRPYSI SSPRDHTPTE IHLTVAWTY HTRDGQGPLH HGVCSTWLNS LKPQOPVPCP 960 

VFNASGFHLP EDPSHPCILI GPGTGIAPFR SFWQQRLHDS QHKGVRGGRM TLVFGCRRPD 1020 

EDHIYQEQ4L EHAQKGVLHA VHTAYSRLP6 KPKVYVQDIL RQQLASEVLR VLHREPGHUY 1080 

VOGDVRMARD VAHTLKQLVA AKliKTMEEQV EDYFPQLKSQ KRYHEDIFGA VFPYEAKKDR 1140 
VAVQPSSLEM SAL 



Seq ID NO: 314 DNA sequence 
Nucleic Acid Accession #: XM_087254
Coding sequence: 47..2332



1 11 21 31 41 51 

| | | | | |

AGAGTACGTG TTTACAGATA AAACTGGTAC ACTGACAGAA AATGAGATGC AGTTTCGG6A 60 

ATGTTCAATT AATGGCATGA AATACCAAGA AATTAATGGT AGACTTGTAC CCGAAGGACC 120 

AACACCAGAC TCTTCAGAAG GAAACTTATC TTATCTTAGT AGTTTATCCC ATCTTAACAA 180 

CTTATCCCAT CTTACAACCA GTTCCTCTTT CAGAACCAGT CCTGAAAATG AAACTGAACT 240 

AATTAAAGAA CATGATCTCT TCTTTAAAGC AGTCAGTCTC TGTCACACTG TACAGATTAG 300 

CAATGTTCAA ACTGACTGCA CTGGTGATGG TCCCTGGCAA TCCAACCTGG CACCATCGCA 360 

GTTGGAGTAC TATGCATCTT CACCAGATGA AAAGGCTCTA GTAGAAGCTG CTGCAAGGAT 420 

TaSTATTGTG TTTATTGGCA ATTCTGAAGA AACTATGGAG GTTAAAACTC TTGGRAAACT 480 

GGAACGGTAC AAACTGCTTC ATATTCTGGA ATTTGATTCA GATCGTAGGA GAATGA6TGT 540 

AATTGTTCAG GCACCTTCAG GTGAGAAGTT ATTATTTGCT AAAGGAGCTG AGTCATCAAT 600 

TCTCCCTAAA TGTATAGGTG GAGAAATAGA AAAAACCAGA ATTCATGTAG ATGAATTTGC 660 

TTTGAAAGGG CTAAGAACTC TGTGTATAGC ATATAGAAAA TTTACATCAA AAGAGTATGA 720 

GGAAATAGAT AAACGCATAT TTGAAGCCAG GACTGCCTTG CAGCAGOGGG AAGAGAAATT 780 

GGCAGCTGTT TTCCAGTTCA TAGAGAAAGA CCTGATATTA CTTGGAGCCA CAGCAGTA6A 840 

AGACAGACTA CAAGATAAAG TTCGAGAAAC TATTGAAGCA TTGAGAATGG CTGGTA'KaA 900 

AGTATOGGTA CTTACTGGGG ATAAACATGA AACAGCTGTT AGTGTGAGTT TATCATGTG6 960 

CCATTTTCAT AGAACCATGA ACATCCTTGA ACTTATAAAC CAGAAATCAG ACAGCGAGTG 1020 

TGCTGAACAA TTGAGGCAGC TTGCCAGAAG AATTACAGAG GATCATGTGA TTCAGCATGG 1080 

GCTGGTAGTG GATGGGACCA GCCTATCTCT TGCSUTTCAGG GAGCATGAAA AACTATTTAT 1140 

GGAAGTTTGC AGAAATTGTT CAGCTGTATT ATGCTGTCGT ATGGCTCCAC TGCAG AAAGC 1200 

AAAAGTAATA AGACTAATAA AAATATCACC TGAGAAACCT ATAACATTGG CTGTTGGTGA 1260 

TGGTGCTAAT GACGTAAGCA TGATACAAGA AGCCCATGTT GGCATAGGAA TCATGGGTAA 1320 

AGAAGGAAGA CAGGCTGCAA GAAACAGTGA CTATGCAATA GCCAGATTTA AGTTCCTCTC 1380 

CAAATTGCTT TTTGTTCATG GTCATTTTTA TTATATTAGA ATAGCTACCC TTGTACAGTA 1440 

TTTTTTTTAT AAGAATGTGT GCTTTATCAC ACOCCAGTTT TTATATCAGT TCTACTGT TT 1500 

6TTTTCTCAG CAAACATTGT ATGACAGC5GT GTACCTGACT TTATACAATA TTTGTTTTAC 1560

TTCCCTACCT ATTCTGATAT ATAGTCTTTT GGAACAGCAT GTAGACCCTC ATGTGTTACR 1620 

AAATAAGCCC ACCCTTTATC GAGACATTAG TAAAAACCGC CTCTTAAGTA TTAAAACATT 1680 

TCTTTATTGG ACCATCCTGG GCTTCAGTCA TGCCTTTATT TTCTTTTTTG GATCCTATTT 1740 

ACTAATAGGG AAAGATACAT CTCTGCTTGG AAATGGCCAG ATGTTTGGAA ACTGGACATT 1800 

TGGCACTTTG GTCTTCACAG TCATGGTTAT TACAGTCACA GTAAAGATGG CTCTGGAAAC 1860 

TCATTTTTGG ACTTGGATCA ACCATCTCGT TACCTGGGGA TCTATTATAT TTTATTTTGT 1920 

ATTTTCCTTG TTTTATQGAG GGATTCTCTG GCCATTTTTG GGCTCCCACA ATATGTMTT 1980 

TGTGTTTATT CAGCTCCTGT CAAGTGGTTC TGCTTG6TTT GCCATAATCC TCATGGTTGT 2040 

TACATGTCTA TTTCTTGATA TCATAAAGAA GGTCTTTGAC CGACACCTCC ACCCTACAAG 2100 

TACTGAAAAG GCACAGCTTA CTGAAACAAA TGCAGGTATC AAGTGCTTGG ACTCCATGTG 2160 

CTGTTTCCCG GAAGGA6AAG CAGCGTGTGC ATCTGTTGGA AGAATGCTGG AACGAGTTAT 2220 

AGGAAGATGT AGTCCAACCC ACATCAGCAG ATCATGGAGT GCATOGGATC CTTTCTATAC 2280 

CaACGACAGG AGCATCTTGA CTCTCTCCAC AATGGACTCA TCTACTTGTT AAAQGQGCAG 2340 
TABTACTTTG TGGGAGOCAG TTCAOCTOCT
ATGGGCACAC TAGCTCTGAA ATTAKTTTOC 
GAGTTATAAT GGCAAACAAA CA(3\AAGCAT 
TGAATCTGAA CATGTTAAAA TTTGAGAATA 
TGTCCCTTGT GCTTATGGGA CTCCTAATGG 
TTTAATMCAA ATGTR6AAAA AACAGAGAAA 
TPGATTATTG ACTCTTCTAT TTAAATCTGC 
AGAACTCTAT TTTTTTATTA GAGTTATATT 
ATACTGAGGA ATTTrGGTCC CTCAGTGACC 
TTCACAGAGC AAATTAGGAG AATCATTTCC 
ATTTATACCA ATTCCTCTAA CTGTACTCTA 
CAAGGGTATA TCATATATAC AAATCAGGAA 
GTTTACTAAT ATTTTTGTGA CAGAGTATAA 
TAGCATATTA TTAATTtAAT GTCTTTATCA 
ACATATTTAA ATTTGCTTTT TTTCTCTTTA 
CATCGTTGTA CAGTTTAACT ATATCAATAA 
TATGTTTAAT TATACAAATC AGAATAGTAT 
CTCTTTCTGC AGCOGACTTA GACATGCTCT 
AGGGTTTCAG TTAATAATCT TATTTTCAGG 
CAATACAGTG AGTTCTGCCA GTGTCOCAGT 
TGTAAAAATG CTCAACTTGT ATCAGGTAAT 
TAATCGGGTA CATGTTACTG TAATTAACTC 
AATTTATCAA GTAGTTCAGT ATTGTCATTT 
TTAAGATTTA GAAGTGATTA TTAGCTTGAG 
TTGTATACAT ATTAAGATAA TGGTTAAATG 
AATTCCTTTA TGGAGATTTA TTGTGCAGCC 
GGCTTCTAGA ATTGGACTGG CAGGGGAAAG 
CTTGTTTGCT TGTAAACTAT TATTTTCTTG 
TAAGGATATT AAGTTATTAA GCTAAATATT 
GATATTTCAT AGCTGGATTT AGGAAGATCT 
CAACGTACAA TGTCTGCATT CACTAATTCA 
CTCAGTAGAG TACTAOGTGG GAGGATATGG 
TGCATATAAC AAAATGACAC CCAGTAGGCC 
GCCATCAAAT AAACTGAGTA CTGACACCAG 
TGACCAACTG CAGCAAGACA GGAGGTCAGC 
ATGTAATTTT CTGTACTCAC CATTTGAAGT 
AAGTAAATGG CAACCACTAG TGTGCTCATC 
TTAAAGCAAA ATTATCTTGT GATTTTAAGA 
CAATGCAGTC TGCAAGCTTT CAGTAGTTTT 
ACTACGTAAC CAGTAATCAC AAGGAAAGTG 
CTTTGGAAAG TATGATGTTG ATAATTAACT 
GCTAAATACG TTATTGCTAA TCAGTGGTCT 
GAGGGCTGTA AGOCTGAAGA TAGTGGCAAG 
GCTGCTTTAA GTGACTCAGC ACCCTGCCTC 
GGAGCAAAGT ATGGGCCAGG GAGAACTACA 
AGGGGAGAAT TTATGGTCTG AATTTTCTAA 
ATACACAAAG GCTTCCAGAC CTGAGCCACA 
CAGAGGCAAA TCAACCCTAG GAAATACTTG 
AGGTCATTTC TACTGGAAAA GATTGTGAGA 
ATAGCCAGGA CTCAAGGCCA CTAGAAAATT 
TGCTACTCTG AAAAATCTCG TGAAGGCTGT 
TTTTCCTGTA AAGATCAGTT TGGGGTATGA 
CAAAGAGTTA CGTAAAACAT GTTTTATTAA 
CTATTTTGAA AT6AGTTATC TATTTTCATA 
TGTGAAATAA CTTGAATGTT GTTCCTATAA 
AATCATGGTA ATTTAGATTT TTATGAGGAA 
GGTTTAAAAT TTTG6ACCTG AGACACTCTG 
TGCATTGTCA GTAAATGTAG TATATTATTG 
GAAGTTATAT TTATCAAATA AAAACTTTCC 

Seq ID NO: 315 Protein sequence: 
Protein Accession #: XP_087254



1 
I 

KQPRECSZNG 
NETELIKEHO 
AAARIGIVFI 
AESSILPKCI 
RBEKLAAVFQ 
SLSOGHPHRT 
EKLFMEVCRN 
GZMGKE6RQA 
QPYCLPSQQT 
SIFCTFLYWTI 
MALETHFWTW 
ILMWTCI.FL 
t^VIGRCSP 



11 
I 

MKYQEINGRL 
LFFKAVSLCH 
GNSEETMEVK 
GGEIEKTRIH 
FIEKDLILLG 
MNILELINQK 
CSAVLCCRMA 
ARNSOYAIAR 
IiYDSVYIiTLY 
liGFSHAFIFF 
INHLVTWGSI 
DIIKKVFORH 
THISRSWSAS 



21 
1 

VPEGPTPDSS 
TVQISNVQTD 
TLGKLERYKL 
VDEFALKGLR 
ATAVEDRLQD 
SDSECAEQLR 
PliQKAKVIRL 
FKFL5KLLPV 
NICFTSLPIL 
PGSYLLIGKD 
IFYFVFSLFY 
liHPTSTEKAQ 
DPFYTNDR5Z 



TTGCTAAAAT 
AAAATCTTTG 
TAGTACAAGC 
AAGAGAGATT 
CATTTCAGTC 
TCTTAGTAAA 
TTCTGTAAAT 
TAAAGCTTTT 
TGIGTTGTTA 
AACCATTATT 
ACACAGCCTG 
TCAGGTCOGT 
AGACCCTATA 
TTGGATCTTT 
CCTGAAGGCT 
AAAGTTTGGA 
GGGTAATTAA 
TCCCTTTCTA 
TTATGTCATC 
ACAAGGCATA 
GTTAGCAATA 
ATTGCACTTC 
GTrTTTGTTT 
AACTATTACC 
CGGTTTTACC 
CTAAGCTTCC 
AATGGTAGAG 
CTAATGTAAC 
AATTTTCAAA 
GTTATTCTGG 
TGTTCCAGAA 
AAATTTGCTC 
TGCATTACAT 
ACAAAGACTC 
TGGCCTATAA 
TAGTTAAGGA 
CTGAACTGTT 
AAAGAGTTTT 
CTAGTGCTAT 
TCCCCTTTGC 
TACCCTTATC 
CAAATCGATT 
CACCAAGTCA 
AGCTTCAGCA 
GCTACGAAGA 
CTGTCCTCTT 
CCCAGGCCCT 
CATTCTGCCC 
TT6AACTTAT 
GACAGTTAAG 
AGGAAAAGGG 
TATAAGCAGG 
TTTTGGTCXIC 
AAAGTAAAAC 
AAAATA6ATC 
TGAGTATCTG 
GCTGTCTAAT 
TACAGCTACT 
TATAT 



31 
I 

EQILSYLSSb 
CTGDGPWQSt? 
LHILBFDSDR 
TLCIAYRKPT 
KVBETIEALR 
QLARRITEOH 
IKISPEKPIT 
KGHFYYIRIA 
lYSLLEQHVD 
TSLLC2TGQMF 
GGILWPFLGS 
LTBTNAGIKC 
LTZ*STMDSST 



TCAGIGTGAT 
TAGTAGTTCA 
CCCTCCCAAC 
TTTCATCrCT 
TGTTGCTGAG 
GAGTATTTTT 
TATGCTGAAA 
CATGGGAAAA 
ATTCATTAAT 
TACTGCAGTA 
TAAAGTTAGC 
TCACOGAACT 
GTGGGTAAAT 
TGCATGCTTT 
C7GTGTATAG 
CAGTATTTAA 
ATGAATACAA 
TAAGCTA6AT 
TAACTTATAG 
TTTCACGTGT 
AATTAAATGC 
AAAAOCTAAC 
TAT-rCAAAAG 
CAGCTCTAAG 
AAGTTTTCCC 
TTCCCATTTC 
ACAGAAATTA 
ATTTGTCTGT 
AATAGTCCTT 
AAGTACTAAA 
GAGGAAATAA 
ATAAAATCTC 
TTACATGACC 
CAAAGTCATA 
TGGTGCTTAA 
GAACTTTATT 
ACTCCAAATC 
CTATTTATTT 
ATTCATCCTG 
ATATTTCTTT 
TGCCAAAACC 
TGCCTCCCTT 
GTTTCCAAAA 
GGCGTAGGCT 
CXTTGCrGTCG 

ATCCTGAACA 
TAOGGTTAGT 
CTGATOGCTT 
AGCCAAAAGT 
AGAATCTTCC 
TATTAATAAA 
CA08TACA6A 
ACTATTAAAG 
ATAACTCATG 
GAAATATTGT 
GTAATCCTTT 
CATAATTTTT 



41 

1 

SHLNNLSKLT 
LAPSQLEYYA 
RRMSVIVQAP 
SKEYEEIDKR 
MAGIKVWVLT 
VIQHGLWDG 
IiAVGDGAKDV 
TLVQYFFYKN 
PHVLQNKPTL 
GNWTPGTLVP 
QNMYFVFIQL 
LDSMCCFPBG 
C 



CACCCTGTTA 
TAOCCACTCA 
AOOCTTAATT 

GCCATTATAT 
TAGTATTAGC 
GTTTGCCTTG 
GTTAATGTGA 
GCATTCTGAG 
1GGG6AGTAA 
CATATAAATG 
TCAAATTGAT 
TAGATACTAT 
AATCTGGTTA 
TATTTCATGA 
ATATTGCAAA 
AAAGAAGAGC 
TTTAGAATAA 
CAAACTACCA 
GGCTGTGGAA 
TAAGAATGAT 
TTGCAIOCTG 
TAATGTTGTC 
CAAATAATGA 
TTGAAAATGT 
ATGAATATAA 
AGACTTTATC 
TCC»GTGATG 
CTTTAACTTA 
AAGAATAATA 
TGAAGATATA 
TTATAAAACG 
GTGTTTATrr 
AAATAGCCTA 
AGTGTGATTG 
TTTTTAAAAA 
CACTtXGTTT 
AAGAAAGTAA 
TAAAACrCTT 
AAAATTCTTT 
AGAGCAAAAT 
TGCCTOGTCT 
TTGCCCCTCA 
CACCCTGGGC 
AGTTGAGAAA 
AAAGCTCATA 
G6AGACTAAA 
ACCAG6ACIG 
GAGACTCCTA 
TTTTAAAATA 
ATGTTGGTGT 
AATAACACAC 
CATTTTATTT 
TGCTGTTTTA 
ATATGTTTGT 
AGCAATACTT 
AAAAATTCTC 
TAAAGTTTAT 



51 
I 

TSS5FRTSPE 
S5PDEKALVE 
SGEKLLFAKG 
IFEARTALQQ 
GDKHETAVSV 
TSLSLALREH 
SMIQEAKVGX 
VCFITPQPLY 
YRDISKNRLIi 
TVMVITVTVK 
LSSGSAWFAI 
EAACASVGRM 



2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 



PCT/US02/12476



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 



Seq ID NO: 316 DNA sequence 
Nucleic Acid Accession #: NM_004473
Coding sequence: 661..1791



1 11 21 31 41 51

| | | | | |

CTOGGCAGCXS GTCOGGGGGG CTGGAGACCC ACGGC3GTGGA GAGGACCAGC CTCAGGTOGC 



CCOGCCTGGG CCCGCGOCCC GACCTOGCTG CCCCXXCCTC GCCTCTCEX3C C0C3TGGCGCT 120

TACCGCCACC TrGGCCTOGG QGGCAGGGCA TGGGCGGCCC CC3GCCAGATC GCCCAGCGCC 180 

AGTACTAACT GC OCTOSCTC TGGCCTTOGA GCGOGAAGOC TCTTCIGOGC GCACAftOCTA 240 

GGCACrraATC CTAAACTftGC GGGCACCACA GACCAGCTGC JSCCCACCOCA AOOCflGQGAT 300 

CACTTC0C5GA CCCCTOGACC GCCCGGCACC AGOGOGCAAG GGAOOCTTCA GCCGGAGACC 360 

AGAGTCCAGT CXXXGTOGCG AGGCCACOGC aSCTGCCCGC CTOGAGAACC ACaAOSCGGG 420 

CTGAGCCGTC GGCTAGOGGG TCACTCOOCSA GCCTCTGTCT GCACCGCGCC AGCCCCAGAC 480 

CAOGGAOGCr GAGCCTCCAG OGGGGOOCAG GCTGG6CCGC TGGGCTCTCC GGGCCAGCCC 540 

GOGACXSATCC CCTGAGCTCT CCGCSGAAGG GCCGAGOGTC CGTTCCGGGG ACGCCAGGCC 600 

CGCCCCOGCC CCCCXavCAGC OtSOSGOGATC CAGAGCCOGG GG6TG0GGGA CGCC OGCGCC 660 

ATGACTGCaS AGAGCGGGCC GCOSCCGCOG CAGCGGGAGS TGCTGGCTAC aTTGAAOGAA 720 

GAGOGCGGOG AGAOGGCAGC AGGGGCOSGG GTCCCAGGGG AGGCCAOGGG GOGOSGGGGG 780 

GGOGGGOGGC GCCGCAAGCG CCCCCTCCAG OGCGGGAAGC CGCCCTACAG CTACATCGCG 840

CTCATCGCCA TGGCCATCGC GCACGCGCCC GAGCGCCGCC TCACGCTGGG CGGCATCTAC 900 

AAGTTCATCA CCGAGCGCTT CCCCTTCTAC CX3G6ACAACC CCAAAAACTG GCAGAACAGC 960 

ATCOGCCACA ACCTCACACT CAACGACTGC TTCCTCAAGA TCCOGCGOGA GGCCGGCCGC 1020 

CCGGGTAAGG GCAACTACTG GGCGCTGGAC CCCAACGOGG AGGACATGTT OGAGAGCGGC 1080 

AGCTTCCTGC GCCGCC3GCAA GOGCTTCAAG CGCTOGGACC TCTCCACCTA CCCGGCTTAC 1140 

ATGCACGACG C3GGCGGCTGC CGCAGCCGCC GCTGCOGCAG CCGCOGOOGC OGCCGCOGCC 1200 

GCCGCCATCT TCCCAGGCGC GGTGCCCGCC GOGOGCCCCC CCTACCOGGG OSCCGTCTAT 1260 

GCAGGCTAOS GGCOGCOGTC GCTGGCOGOG CCCCCTCCAG TCTACTACCC CGOGGCGTCG 1320 

CCCGGCOCTT GC CGQGTCrf CGGCCPQGTT CCTGAGCGGC CGCTCAGCCC AGAGCTGGGG 1380 

CCCGCACCGT OQGGGCCCGG OGGCTCTTGC GCCTTTGCCT OCGCOGGOGC CCCCGCTACC 1440 

ACCACCGGCT ACCACCCCGC AGGCTGCACC GGGGCCCGGC CGGCCAACCC CTCTGC CTAT 1500 

GOGGCTGCCT ACGOGGGCCC CGAOGGCGCG TACCCGCAGG GOGCCGGCAG TGCX3ATCTTT 1560 

GCCGCTGCTG GCCGCCTGGC GGGACCOGCT TCGCCCCCAC OGGGOXSCAG CAGTGGCGGC 1620 

GTGGAGACCA CGGTGGACTT CTAOGGGCGC ACGTCGCCCG GCCAGTTCGG AGCGCTGGGA 1680 

GCCTGCTACA ACCCrGGOSG GCAGCTOGGA GGGGCCAGTG CAGGCGCCTA CCATGCTCGC 1740 

CAT6CTG0C6 CTTATCCCGG TGGGATAGAT OGGT TC GTG T COGCCATGTG AGCCAGCGTA 1800 

GGGACGAAAA CTCATAGACA CATOG6CTGT TCACAOGTTC CCOGCAACCT GAGAACGAAC 1860 

AGGAATGGAG AGAGGACTCA ACTGGGACCC ACGTGGAAAA GACOGAGCAG GCCACAGAGG 1920 

CTCGGTCTCC CCGOGCACAG CGTAGGCACC CTGTGTACTC TGTAAAOGGG AGGAGGTGGG 1980 

G0GAGGCA6C CAGAGCCCTT GGACIGGCAC AGGGACCCTC GATGGAGOSA AGCCCTCAAA 2040 

OSGGATGCTT TCTGGCATTC TATOGGGGAG GGTCCTTGGC GGTAACCAGA GGGCAGCGTA 2100 

GTGTCRACAC CAGAGACCAG GATCCAAATT GTGGGGAATC AGTTTCAGOC TTCCATGTGC 2160 

TGCCGGAACT CC3GGCCTTTT TAOGCGGTTC GTCCTCTAGT GCCTTTAACT GCGTTACTAC 2220 

AATAAAAGGC TGCGGCAGCG CCTTTCTTCT TAAAGTGAGG AGGACAAATT TGCAAAAGAA 2280 

ATAGGCTTTT CTTCTTTTTT AAATTGGAGA AATCTCTGCT CTGGTTGACC TGGGCTGGTT 2340 

TTCCCTGTCT CTGAGAACTT GAGACCTAGC TCCGAGTTGA ACTGTGCGTC AGCACTCCAG 2400 

TCCCATCACC TGAACCTTCA GTCTCCCCCA TCTGTTACAC TAGAGGGCTG CAGGACTCTA 2460 

TCCACCGCCC CCGGGTTATC ATTCAGGGCC CCATCATCTT GGATGCTGCC CTGCGTATTT 2520 

GGCACCAATG GTGGGCCACC CAGGGCCTCT GA6TAGCCAC CCAAAGCCTA GCOGCTGTTC 2580 

TAGGGAAOGG AAAAGAGTTC ATGGCCAAGC GTCTAACCTA AAGTCCCAGG ATTGGCTCCA 2640 

GGCAGCAATT ATATCATAAC TTATTGAACT TTTGAGCAGG AOGTGCTGGT AATTTCATGG 2700 

CTGTTACTGC CCAGTCATAA ATCTGCTTTT CCATTATAAG GCAGAGAGAA GTACATTCGT 2760 

TCATTTGTCC ACTGTTTCTT GTCATCACGC AGCCCTGGAC CCAAAGGGTG AACTAAAGTT 2820 

TAAGGAGATG AGAGGATTCA AGGAGCCGGT TGGTGAOGOC TTTCAGTAGC TGGGGAGGGC 2880 

TCTTCCATCC CCAGCACCCC CTGCTACACC TCAGCAGCCT CCCCCATGCA AAAAGGAAAG 2940 

AGAAAAATTA AGTTAGGGCA GTCAGTAAAG TGAGCTTTAG AAAGAAACTG GAATTTTAAC 3000 

TTCATTTTGT ATCTTGCTTA AGTAGCAGGC TCACTAAAAT TAGAGAAAGT CCAATAACTC 3060 

TCCCCCTTTC CCTTGAGAAA TCTTTAAGTT TCGATTCTGG AGCAAAAACT TTCAGCATTA 3120 

AATATTTCAG AGGCTCCATT CACAGCTTTC AGATAAACTG GAGTGTTCAG ATGGACTGTT 3180 

TTAATAAAAA TCTTTGAGCA AGTGAGTTAT GGCAAGAGAA ACTCAGCCTC TTTCTGTATA 3240 

AACTTAACAG GGAAGGGCTG G6GTGTGAAA AAGAAGATTG TATGAAAACC ATTGGTAATT 3300 

TrrATTTTTT ATTTTTGGGA CTGCACTATC CTGTTCACGA AGACATGTGA ACTT66TTCA 3360 

GTCCAAA3GG GGATTTGTAT AAACCAGTGC TCTCCATTAG AAATATGGTG CAAGCCACAT 3420 

ATGTAATTTT AAATATTCTA GTAGOCACAT TAATAAAGIV AAAAGAAACA AAAAAAAAAA 3480 
AA 

Seq ID NO: 317 Protein sequence: 
Protein Accession #: NP_004464

1 11 21 31 41 51 

I i I I I I 

FKHLTHYROI DTRANSCRIP TIONPACTOR TTFMTAESGP PPPQPEVLAT VKEBRGETAA 60 

GAGVPGEATG RGAGGRRRKR PLQRGKPPYS YIALIAMAIA HAPERRLTLG GIYKFITERF 120 

PFYRO»FXKW QtfSIRRNLTL NDCFLKIPRE AGRFGKGNYH ALDPKAEDMF ESGSFItRRRK 180 

RFKRSDLSTY PAYMHOAAAA AAAAAAAAAA AAAAAZFP6A VPAARPPYP6 AVYAGYAPPS 240 

LAAPPFVYYP AASPGPCRVP GLVPERPLSP ELGPAPSGFG GSCAFASAGA PATTTGYOPA 300 

GCtGARPANP SAYAAAYAOP OGAYPQGAGS AIPAAAGRLA GPASPPAGGS SGGVETTVDF 360 
YGSTSPOQFG ALGACYNPGG QLG6ASAGAY KARKAAAYPG GIDRFVSAM 

Seq ID NO: 318 DNA sequence
Nucleic Acid Accession #: NM_005688
Coding sequence: 126..4439



1 11 21 31 41 51 

i I I i I I 

CCGGGCAGGT GGCTCATGCT CGGGAGOGTG GTTGAGCGGC TGGCGCGGTT GTCCTGGAGC 60 

AGG6GCGCAG GAATTCTGAT G7GAAACTAA CAGTCTGTGA GCGCTGGAAC CTCGGCTCAG 120 

AGAAGATGAA GGATATCGAC ATAG6AAAAG AGTATATCAT CCCCAGTGCT GGGTATAGAA 180 

GTGTGAGGGA GAGAACCAGC ACTTCTGGGA OGCACAGAGA COGTGAAGAT TOCAAGTTCA 240 

GGAGAACTCG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCCGAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCOCAAGG 360 

GAAAGTACCA TCATG6CTTG A6TGCTCTGA A6CCCATCGG GACTACTTOC AAACACCAGC 420 

ACOCAGTGGA CAATGCTGGG LTll- m ' O .'T GTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 



CXXXjTGTGGC CCACAAGAAG GGGGAGCTCT OVATGGAAGA OGTGTGGTCT CTGTCCAAGC 540

AOGAGTCTTC TGAOGTGAAC TGCAGAAGAC TACAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTDGGGCC AGACGCTGCT TCCCTGOGAA GGGTTGTGTG GATCTTCTGC OGCAOCAGGC 660 

TCATCCTGTC CATOGTGTGC CTGATGATCA OGCAGCTGGC TOGCTTCAGT GGACCAGCCT 720 

TCATGGTGAA ACACCTCrTG GAGTATACCC AGGCAACAGA (TTCTAACCTG CAGTACAGCT 780 

TGTTGTTAGT GCTGGGCCTC CTCCTGAOSG AAATGGTGOS GTCTTGGTOG CTTGCACTGA 840

CTTGGGCATT GAATTACCGA ACOOGTGTCC GCTTGOGGGG GGCCATCCTA ACCATGGCAT 900 

TTAAGAAGAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTGGGTGAG CTCATCAACA 960 

TTTGCTCCAA OGATGGGCAG AGAATCTTTG AGGCAGCAGC CCrTGGCAGC CTGCTGGCTG 1020 

GAGGACCOGT TGTTGCCATC TTAGGCATGA TTTATAATCT AATTATTCTG GGACCAACAG 1080 

GCTTCCTGGG ATCAGCTGTT TTTATOCTCT TTTA0CCA6C AATGATGTTT GCATCAOGGC 1140 

TCACAGCATA 7TTCAGGAGA AAATG0GTG6 COGCCAOGGft TGAA06TGTC CAGAAGATGA 1200 

ATGAAGTTCT TACTTACATT AAATTTATCA AAATGTATGC CTGGGTCAAA GCAnTTCTC 1260 

AGAGTGTTCA AAAAATCOGC GAGGACGAGC GTOGGATATT GGAAAAAGCX: GGGTACTTCC 1320 

AGGGTATCAC TGlti ^ G tf lXjTX^ GCTCCCATTG TGGTGGTGAT TGCCAGCGTX3 CTGACXTTCT 1380

CTGTTCATAT GACCCTGGGC TTOGATCTGA CAGCAGCACA GGCTTTCACA GTGGTGACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACACGGTT TTCAGTAAAG TCCCTCTCAG 1500 

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAAGAACAA ACCAGCCAGT CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

GGGACrCCTC CCACTCCAGT ATCCAGAACT OGCCCAAGCT GACCCGCAAA ATGAAAAAAG 1680 

ACAAGAGGGC TTCCAGGGGC AAGAAAGAGA AGGTOAGGCA GCTGCAGCGC ACTGAGCATC 1740 

AGGOSGTGCT GGCAGAGCAG AAAGGCCACC TCCTCCTGGA CAGTGAC3GAG CGGCCCAGTC 1800 

CCGAAGAGGA AGAAGGCAAG CACATCCACC TGGGCCACCT GOGCTTACAG AGGACACTGC 1860 

ACAGCATOGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCT60GGC AGTOTGGGAA 1920 

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GA06CTTCTA GAGGGCAGCA 1980 

TTGCAATCAG TGGAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCOUS CAGOGACCTG AOGGAGATTG 2160 

GAGAG0GA06 AGCCAACCTG AGOGGTGGGC AG06CCAGAG GATCAGOCTT GCCCX3GGCCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG AOGACCCCCT CAGTGCCTTA GATGOCCATG 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACAGTTCTGT 2340 

TTGTTACCX3V CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTATTAC GGAAAGAGGC ACCCATGAGG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTGGGAGAGA CACOGCCAGT TGAGATCAAT TCAAAAAAGG 2520 

AAACCAGTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 

AGGAAAAAGC AGTAAAGCCA GAGGAAG6GC AGCTTGTGCA GCTGGAAGAG AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CTCTrGGCAT 2700 

TCCTG6TTAT TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACOGCCTTC AGCAOCTGGT 2760 

GGTTGAGTTA CTGGATCAAG CAAGGAAGCG GGAACACCAC TGTGACTOGA GGGAACGAGA 2820 

CCTCGGTGAG TGACAGCATG AAGGACAATC CTCATATGCA GTACTATGCC AGCATCTAOG 2880 

CCXrrCTCCAT GGCAGTCATG CTGATCCTGA AAGCCATTCG AGGAGTTGTC TTTGTCAAGG 2940 

GCACGCTGCG AGCTTCCTCC CGGCTGCATG ACGAGCTTTT CCGAAGGATC CTTCGAAGCC 3000 

CTATGAAGTT TTTTGACACG ACCCOCACAG GGAGGATTCT CAACAGGrrr TCCAAAGACA 3060 

TGGATGAAGT TGACXTTGCGG CTGCCGTTCC AGGCOSAGAT GTTCATCCAG AACGTTATCC 3120 

T6GTGTTCTT CTGTGTGGGA ATGATCGCAG GAGTCTTCXX: GTGGTTCCTT GTGGCAGTGG 3180 

GGCCCXrrTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 

TGAAGCGTCT GGACAATATC ACGCAGTCAC CTTTCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCTGCAC AGATACCAGG 3360 

AGCTGCTGGA TGACAACCAA GCTCCTTTTT TTTTGTTTAC 6T6TG0GATG 0G6TGGCTGG 3420 

CTGTGCGGCT GGACXTTCATC AGCATOGCCC TCATCACCAC CA06GGGCTG ATQATCGTTC 3480 

TTATGCAOGG GCAGATTCCC CCAGCCTATG GGGGTCTOGC CATCTCTTAT GCTGTCCAGT 3540

TAACGGGGCT GTTCCAGTTT ACGGTCAGAC TGGCATCTGA GACAGAAGCT CGATTCACCT 3600 

CGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660 

AGAACAAGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT GAGAACGCAG 3720 

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780 

CTAAAGAGAA GATTGGCATT GTGGGGCGGA CAGGATCAGG GAAGTCCTOG CTGGGGATGG 3840 

CCCTCTTCCG TCTGGTGGAG TTATCTGGAG GCTGCATCAA GATTGATGGA GTGAGAATCA 3900 

GT6ATATTGG CCTTGGCGAC CTCOGAAGGA AACTCTCXAT CATTOCTCAA GAGCOGGTGC 3960 

TGTTCAGTGG CACIGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGACCAGA 4020 

TTTGGGATGC CCT6GAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGCGGAACGG CAGCTCTTGT 4140 

GCATAGCTAG AGCCCTGCTC CGCCACTGTA AGATTCTGAT TTTAGATGAA GCCACAGCTG 4200 

CCATGGACAC AGASACAGAC TTATTGATTC AAGAGACCAT CGGAQAAGOV TTTGCAGACT 4260 

GTACCATGCT GACCATTGCC CATCGCCTGC ACACGGTTCT AGGCTCGGAT AGGATTATGG 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACACCCCATC GGTCCTTCTG TCCAACGACA 4380 

GTTCCOGATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGTCGCTGTC AAGCGCTGAC 4440 

TCCTCCCTGT TGACGAAGTC TCTTTTCTTT AGAGCATTGC CATTCXOTGC CTGGGGCGGG 4500

COCCTCATOG CGTCCTCCTA CCGAAACCTT GCCTTTCTCG ATTTTATCTT TOGCACAGCA 4560 

GTTCC3GGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

G0GAACC3GTT ATTATAATTG TATCA6AGQC CTATAATGAA GCTTTATAOG TGTAGCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATGATT TTTGTACAGT 4860 

TTGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920 

CTCTAGCTGG TGGTTTCAOG GTGCCAGGTT TTCTGGGT6T CCAAAGGAAG ACGTGT6GCA 4980 

ATAGTGGGCC CTCCGACAGC CCCCTCTGCC GCCTCCCCAC AGCOGCTGCA GGGGTGGCTG 5040 

GAGACGGGTG GGOGGCTGGA GACC ATGCAG AGGGCGGT6A GTTCTCAGGG CTCCTGCCTT 5100 

CTGT CCTGGT GTCACTTACT GTTTCTGTCA GGAGAGCAGC GGGGOGAAGC CCAGGCCCCT 5160 

TTTCACrCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 

TTTOCTGCCT TCrrCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280 

TCCCftCTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACXTT 5340 

GTTG6TTCCA AGCCCT6GAG CCAACTGCTG CTTTTTGAGG TGGCACTTTT TCATTTGCCT 5400 

ATTCCCACAC CTOCACAGTT CAGTGGCAGG GCTCA6GATT TCGTQGGTCT GrTTTCCTTT 5460 

CTCAOOGCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520 

CACCTCTTGC TAATCAGT6T CTCACACTG6 06TAGAAGTT TTTGTACTGT AAAGAGAC3CT 5580

ACCTCAGGTT GCTGGTTGCT GTGTG6TTTG GTGTGTTOCC GCAAACCCXX: 'mtriXjCIGT 5640 

G6GGCTGGTA GCTCAGGTGG GOGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGGGTIGC 5700 
ATCTCGTGAC CRACTArSACA TTCTGTCCCC TTflG CATCTf TGCTGAACAC CrrG]JGGAAC 5760
CAAAAATCTG AAAATGTGAA TAAAATTATT TTGGATTTTG TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 



Seq ID NO: 319 Protein sequence:
Protein Accession #: NP_005679

1 11 21 31 41 51

ilKDIDIGKEY llPSPGYRSV RBRTSTSGTH RDREDSKFRa TRPLECQDAL BTAARAEGLS 60
LDAS>KSQLR MEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLPSOj JPSWbSSLAR 120
VAHKKGEI^ EDVWSLSKHE SSDVNCRRLE RLHQEELNBV GPDAASLRRV WIP^TRM 180
LSIVaJ4ITQ LAGFSGPAFM VKHLLEYTQA T3SNLQYSLL LVLGLLLTBI VRSWSLALTW 240
AUWRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQRMPEA AA^SLL^ 300
PWAILGMIY NVIILGPTCF LGSAVPILPY PAMMFASHLT AYFRHKCVAA TDERVQKHNE 360
VLTYimKM YAMVKAPSQS VQKIRBBERR rLEKAGYFQG ITVGVAPIW VIASWTFSV 420
SS^S I^FtSvtSf NiriTFALKVT PFSVKSLSEA SVAVDRP^ ^5^^^^ 480
KKPASPHIKI EKKKATLAWD SSHSSIQNSP KLTProOKDK RASRGKKBKV RQ^™QA 540
VLAEQKGHLL LDSDERPSPE EBEGKHIHLG HLRLQRTiaS IDLEIQEGKt VGI0GSVG9G 600

KTSLISAILG QMTLLEGSIA ISGTFAYVAQ QAWILNATLR D5JILFGKEYD EfaYNSVLNS 660 

CCLRPDLAIL PSSDLTEIGB RGANLSGGQR QRISLARALY SDRSIYILDD PJ^M^OVG 720 

HHIFNSAIRK HLKSKTVLFV THQIiQYLVDC DEVIFMKEGC ITERGTHEEL MNLNGDYATI 780 

FNNLLLGBTP PVEINSKKET SGSQKKSQDK GPKTGSVKKE KAVKPEEGQL VQLEEiCGQGS 840 

VPWSVYGVYI QAAGGPIAFL VIMALFMMIV GSTAPSWJL SYWIKQGSGN TTVTRQIETS 900 
VSOSMKDNPH MQYYASIYAL SMAVMLILKA IRGWFVitGT LHASSRLHDE S^JJf™ 960

KFFDTTPTGR ILNRPSKDJ© EVDVRLPTOA EMFIQNVILV PPCVGMIAGV FPWFLVAVGP 1020 

I.VILPSVLHI VSRVLIRELK RLDNITQSPF LSHITSSIQG LATIHAYNKG QEFLHRYQEI. 1080 

LDTOIQAPFFI. FTCAMRWLAV RU^LISIALI TTTGLMIVLM HGQIPPAYAG IAISYAVQl|T 1140 

GLFQFTVRLA SETEARPTSV ERINHYIKTL SLEAPARIKN KAPSPDWPQE GEVTFENAaj 1200 

RYRENLPLVL KKVSFTIKPK EKIGIVGRTG SGKSSLGMAL FRLVBLSGGC IKIDGVRISD 1260 

IGLADLRSKL SIIPQEPVLP SGTVRSMU>P FNQYTEDQIW DAI^RTHMKB CIAQLPLKLB 1320 

SHVMENGDNF SVGERQLLCI ARALLRHCKl LILDBATAAM OTETDIiLIQB TIREAFADCT 1380 
MLTIAHRLHT VLGSDRIMVL AQGQWEFDT PSVLLSMDSS RPYAMPAAAE NKVAVKG 

Seq ID NO: 320 DNA sequence 

Nucleic Acid Accession #: AK022089.1

Coding sequence: 181-1488

1 11 21 31 41 51

icCAGTrGCA CAACTTCCAG CAACTTTCTC AGCCGGCTAC TAATGAGCTG AAAGCCAGGA 60 

ACATCOGAGG AGAAGAGAAA GCTTCCAGCC CTCCTOCCTT CAOCCTGGAA ATCCAGACAC 120 

CCCCACCCCC ACCCTCAGAT CACTTTAAGA TAATTTCTTT AITCGTITGC CCGACAGACC 180 

ATGGCTCCCT TTGGAAGAAA CTTGCTAAAG ACTCGGCATA AAAACAGATC TCCAACTAAA 240 

GACATGGATT CAGAAGAGAA GGAAATTGTG GTTTGGGTTT GCCAAGAAGA GAAOCTTGTC 300 

TGTGGGCTGA CTAAACGCAC CACCTCTGCT GATGTCATCC AGGCTTTGCT TGAGGAACAT 360 

GAGGCTACGT TTGGAGAGAA AC6ATTTCTT CTGGGGAAGC CCAGTGATTA CTGCATCATA 420 

GAGAAGTGGA GAGGCTCOGA AAGGGTTCTT CCTCCACTAA CTAGAATCCT GAAGCTTTGG 480 

AAAGCGTGGG GAGATGAGCA GCCCAATATG CAATTTGTTr TGGTTAAAGC AGATGCTTTT 540 

CTTCCAGTTC CTTTGTGGCG GACAGCTGAA GCCAAATTAG TGCAAAACAC AGAAAAATTG 600 

TGGGAGCTCA GCCCAGCAAA CTACATGAAG ACTTTACCAC CAGATAAACA AAAAAGAATA 660 

GTCAGGAAAA CTTTCCGGAA ACTGGCTAAA ATTAAGCAGG ACACAGTTTC TCATGATC6A 720 

6ATAATATG6 AGACATTAGT TCATCTGATC ATTTCCCAGG ACCATACTAT TCATCAGCAA 780 

GTCAAGAGAA T6AAAGAGCT GGATCTGGAA ATTGAAAAGT GTGAAGCTAA GTTCCATCTT 840 

GATCGAGTAG AAAATGATGG AGAAAACTAT GTTCAGGATG CATATTTAAT GCCCAGTTTC 900 

AGTGAAGTTG AGCAAAATCT AGACTTGCAG TATGAGGAAA ACCAGACTCT GGAGGACCTG 960 

AGCGAAAGTG ATCGAATTGA ACAGCTGGAA GAACGACTGA AATATTACCG AATACTCATT 1020 

GATAAGCTCT CTGCTCAAAT AGAAAAAGAG GTAAAAAGTG TTTGCATTGA TATAAATOAA 1080 

GATGCGGAAG GGGAAGCTGC AAGTGAACTG GAAAGCTCTA ATTTAGAGAG TGTTAAGTGT 1140 

GATTTGGAGA AAAGCAXpAA AGCTGGTTTG AAAATTCACT CTCATTTGAG TGGCATCCAG 1200 

AAAGAGATTA AATACAGTGA CTCATTGCTT CAGATGAAAG CAAAAGAATA TGAACTCCTG 1260 

GCCAAGGAAT TCAATTCACT TCACATTAGC AACAAA6ATG GGTGCCAGTT AAAGGAAAAC 1320 

AGAGCGAAGG AATCTCAGGT TCCCAGTAGC AATGGGGAGA TTOCTCCCTT TACTCAAAGA 1380 

GTATTTAGCA ATTACACAAA TGACACAGAC TOGGACACTG GTATCAGTTC TAACCA^ 1440 

CAGGACTCCG AAACAACAGT AGGAGATGTG GTGCTGTTGT CAACATAGTT CCAATGGCTC 1500

CTTTCTGACC TX;CTTTCATG TTTTAATGTT TGTTTAATTT AATAGGAAAC CTCATTTTAA 1560

ATATAACACT CAAAAAAATG TAAATCATAT T6TACTATTC AATAGTTAAT AAAAACTCGA 1620 
GAAATGTGTT GTTTCTG 

Seq ID NO: 321 Protein sequence: 
Protein Accession #: NP_005438.1

1 11 21 31 41 51 

MAPFGRNLLK TRHXNRSPTK DMDSEEKEIV VWVCQEEKLV CGLTKRTTSA DVIQALLEEH 60 

EATFGEKRFL LGKPSDYCII EKWRGSERVL PPLTRILKtW KAWGDBQPNM QPVLVKADAP 120 

LPVPLWRTAE AKLVQNTEKL WELSPANYMK TLPPDKQKRI VRKTPRKIiAK IKQDTVSHDR 180 

DNMETLVHbl ISQDHTIHQQ VKRMKELDLE lEKCBAKFHL DRVEKDGKNY VQDAYLMPSF 240 

SEVEQNXiDLQ YEENQTLEDL SESDGIEQLE ERLKYYRILI DKLSAEIEKE VKSVCIDINE 300 

DAEGEAASEL ESSNLESVKC DLEKSMKAGL KIHSHLSGIQ KEIKYSDSU. QMKAKBYm.L 360 

AKEFNSLHIS NKDGCQLKEN RAKESBVPSS NGEIPPPTQR VFSNYTNDTD SDTGISSNHS 420 
QDSETTVGDV VLLST 

Seq ID NO: 322 DNA sequence 

Nucleic Acid Accession #: NM_030920.1
Coding sequence: 317-1123

1 11 21 31 41 51

Igcattgaag GGGAAGGAAC TGCGGGTGTG GTGTGTGTAT GTGTGTGTGT ATGTGTGTGC 60 

GGCGOGTGCG TGOGTGTGTG TGOGCGOGCT AGTGTGTGGA CAAGGAGGTG GGGGCAGCTG 120 

AGTTAGACTC CCAACTCTTG GACTCCATTT GCTATTCTCr TCTTTCTCCC CC ACACCTAT 180 

CTGGTGGTGG TAGTGGGOGT TTATATTTGC GTTCCTTTTC ATTCATTTCT AAATCTCTTA 240 

AAAATTTTCG GTTGGGGGTA TTGGGGAAGG CAGGAAAGG6 AAAAGGAGAfi TAGTAGCTGA 300 

AGAGCAAbAG GAGGACRTGG AGATGAAGAA GAAGATTAAC CTGGAGrTAA GGAACAGATC 360 

CCOGGAGGAG GTCACAGAGT TAGTCCTTGA TAATTGCCTG TGTGTCAATG GGGAAATTGA 420 

AGGCCTGAAT GATACTTTCA AAGAACTAGA ATTTCTGAC5T ATGGCTAATG TGGAACTAAG 480 

TTCGCTCGCC CGGCTTCCCA GCTTAAATAA ACITOGAAAA TTGGAGCTTA GTGATAATAT 540 

AATTTCTGGA GGCTTGGAAG TCCTGGCAGA GAAATGTCCA AATCTTAOCT ACCTCAATCT 600 

GAGTGGAAAC AAAATAAAAG ATCTCAGTAC AGTAGAAGCT CTGCAAAATC TTAAAAATTT 660 

GAAAAGTCTT GACCTGTTTA ACTGTGAGAT CACAAACCTG GAAGATTATA GAGAAAGTAT 720 

TTTIGAACTA CTCCAGCAAA TCACATACTT AGATGGATTT GATCAGGAGG ATAATCA^ 780 

GCCGGACTCT GAAGftGGAGG ATGATGAGGA TGGAGATGAA GATGATGAAG AGGAAGAGGA 840 

AAATGAAGCT GGTCCACCGG AAGGATATGA QGAAGAGGAG GAGGAAGAGG AAGAGGAGGA 900 

TGAGGATGAG GATGAAGATG AAGATGAAGC AGGTTCAGAG TTGGGAGAGG GAGAAGAGGA 960 

AGTCGGCXrrC TCATACTTAA TGAAAGAAGA AATTCAGGAT GAAGAAGATG ATGATGAC» 1020 

TGTTGAAGAA GGGGAAGAAG AGGAAGAAGA GGAAGAAGGA GGTCTTC6AG GGGAGAAGAG 1080 

GAAACGAGAT GCTGAAGACG ATGGAGAGGA AGAAGATGAC TAGATCATTC TAAGACCAGA 1140 

TTCTCTAATG TTTCroGGTC TGCAATAGAG TGATCACATC TTTGTTTCTT CATGTAOGAT 1200 

AGCTATCCCT ACAGAAGATA ATCTGTAACT TTTTATAGGA AAAGTGTGGT TTTACTATTT 1260 

TTGCCTTATC ATTCCAAATA AGftACTWSTC TCTTAATGAT CATATTGTAT GTAGRGAA AA 1320

ATTTTCATTG ACTCCCATTG TCGAATTCCC TAGCAATTTA TTTAGACTTA ATTTTTTAAA 1380 

TTCAAGCTTA CTGTATTAGT CATTTTTAGC CCATAATTAA AACATGATCA CTTTTAAACA 1440 

GGTGTAGTAT GGTGCATTTC ATTCCTTATT TATAGATTAA CTGAAATTAC AGTTTGCTAT 1500 

AATATAAAAT GACAATAGTC TCTTGAGTGG TAAGTTGGTT ATTTTTTTAG AGGTGATCCA 1560 

GGAATCTTTA GTrCGAAGGC AGTTAOCTTT TTTTTTTTTT TTTTTTTTTG ACTAAGfl^ 1620 

TTTGGTOCT TTTTTGTCAC AAGTAACTTG GAAAATAGAA GCAGAATAGT AAAGGTTCTA 1680 

TTCAGCAACA TAGTTCATCG ATTTTGTGGA GGTTCTATTC AGTAATATGG TTCATGGATT 1740 

TAGTGGTGAC TGATAAGATT TTATTTTTGA AGGAAAAATT GCTTATACTA AGTCC A GAg L 1800 

CATGCAGGTG AGOCCTTTTG TCAGGCTGCA AATCATGACA TGCCGATGGT TGTTTATTTT 1860 

GTTTTTAGGT GTGCATTCTT TTTCTTCTTA GCAATTCCTT TATGATCACC TTCCCTTCTT 1920 

GTTTCACTCC CTCCCGCTCT CTCAAAAGGA ACTTGGGAAA CTTGTGAAAC CCAGGAAAAC 1980 

CTTTAGTCTT ATACCTCAAC TACX3TTTCAG TCCTGTCTGG GTTTTAAATA AGTGAAGTAG 2040 

AAGAAATTGA GTATTTTCTG ACATAAGAAT ATATTATCAA TACAGTTTTA TGCAGTAAGC 2100 

TCTCCTTAOC ATAAATGTTT CTTGGTTGAC AACATCTAAG ACAATATTAG TGGGATOAAG 2160 

AAAGAAAAGC AGGGGTGCTT TTGGAAGCAG TGTTAGTGTT CCTCAAAAGT CGGAACAATT 2220 

GCCTGTTGAT ATATTAATAA GACATTAAAG TCAAATTTTA ATGTTGGCCT CTCAAATGAT 2280 

TTGGATACCA CTCTGCAAAG TATTTCTAAC CTTTAATTCC CAGTTTTAAA ACAGATATAA 2340 

TAATAGCATT TAATTGQAAT ATACTAGGCA GCTG6AAAAG TATTTGAAAC TAAATTGACA 2400 

TTAAAATTAA GATTTGTTTT CAAGTGGATG TCCATTAAAA GTAGAAAAAT ATTTGGGATA 2460 

AGTGAGTGTG TGTTTCCTTA CATGGCTACT AAATAAAATA TAATGAGTAT ACAAGTATAT 2520 

CTCCTCTTTT GCTATGGAGG CTCCATGTTC AAGGCAATGG CTTTTTAAAT CTTGGCTATC 2580 

TAAAATTTTT TCCCTTTGTT TTGAATATTT GTAAGTTTTT AAGAAGTTAG TGTCAGCAAA 2640 

TTAATTGAAG TTATGCTTCT ATACTGGGAC ATATTTAAAT ACTGAGTATA GTACTGCTGC 2700 

TACTGCTTCT ACAATCTAAA ATGTATGACT TGGTGTTTTA AAGTAAAAAT TATGATGTTA 2760 

CTTGTGGAGA AAGTAAAAAT 6TTGTACAAC TGACCGAAAG AAAAC CCTTG GGGATAAGTT 2820 

TAGTGAGGGG ATTGGAATCC CCAAAAAGAT AACATTTTTC TTCTGCrTTT AAAAACTGAA 2880 

ATTCCCTGTT CTAGTTCCTA ACAATTCTCA TTACATACTA TGCCAGATTA CAAAATACTT 2940 

ATTTTTAAAA TGAAATCTAT ATATTGACTT TCTTATCAAT CATCTTACTG TGCAATCAAA 3000 

ATTAGAGTAC TTTGGTTTGA AAACAACACT TAGAGCCTCC AGAT AACT TT T AAGACTT AT 3060

TTAGCTTTGT GGGTGGTATT TTCATCCAAA TAAGTAAGGG TGGGTTTTAT ATTTTGTAGA 3120 

AGTTTTOGGT CCTATTTTAA TGCTCTTTGT ATGGCAGTAT GTATATATTG TGTTAAGTTC 3180 

CTCAAGAATC TCCTTAAAAA CTTTGAAGTT AATACTTTTG TGCAACTGTG TTTTGAATAA 3240 
AGCCATGACA GTGTTAAAAA CAAAC 

Seq ID NO: 323 Protein sequence:
Protein Accession #: NP_112182.1

1 11 21 31 41 51 

MEMKKKINLE LnRSPBEVT ELVLDNCLCV NGEIBGLUDT FKELEFI.SMA NVELSSLARL 60 

PSLNKLRKLE LSDNIISGGL EVtAEKCPNL TYINLSOTKI KDLSTVEALQ NLKNLKSLDL 120 

FNCEITNLED YRESIFELLQ QITYLDGFDQ EDNEAPDSEE EDDEDGDEDD EEEEENEAGP 180 

PBGYEEEEEE EEEEDEDEDE DEDEAGSELG EGEEEVGLSY LMKEBIQDEE DDDDYVEEGE 240 
EEEEEEEGGIi SGEKRKRDAB DDGEEEDD 

Seq ID NO: 324 DNA sequence
Nucleic Acid Accession #: NM_003812
Coding sequence: 224..2722

1 11 21 31 41 51 

TCCTCTGCGT CC0GCCCCX3G GAGTGGCT6C GAGGCTAGGC GAGCCGGGAA AGGGGGOGGC 60 

GCXXy^GCCCC GAGCCCCGCG CCCOGTGCCC OGAGCCOGGA GCCOCCTGCC OGOGGOSSCA 120 

CCATGCGCGC CGAGCCGGCG TGACCGGCTC CGCCCGCGGC C33CCCCGCAG CTAGCCX3GGC 180 

GCTCTCGCCG GCCACACGGA GOGGOGCCCG GGAGCTATGA GCCATGAAGC CGCCOGGCAG 240 

CAGCTCGCGG CAGCCGCCCC TQGCGGGCTG CAGCCTTGCC CGCGCTTCCT GCGGCCCCCA 300 

ACGOGGCCCC 6C0GGCTGGG TGCCTGCCftG C60CC0GG0C CGCACGCOGC CCTGCCGCCT 360 

GCTTCTCGTC CTTCTCCTGC TGCCTCOGCT 0GCCG0CT03 TCCCGGCCCC GCGCCTGGGG 420 

GGCTGCTGCG CCCAGGGCTC CGCATTGGAA TGAAACTGCA GAAAAAAATT TGGGAGTOCT 480 

GGCAGATGAA GACAATACAT TGCAACAGAA TAGCAGCAGT AATATCAGTT ACAGCAATGC 540 

AATGCAGAAA GAAATCACAC TGCCTTCAAG ACTCATATAT TACATCAACC AAGACXOGGA 600 
660
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560
1620 
1680 
1740 
1800 
1860 
1920 
2040
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 



Seq ID NO: 325 Protein sequence: 
Protein Accession #: NP_003803

1 11 21 31 41 51

MKPPGSSSRQ PPLAGCSIAG ASCGPQRGPA GSVPASAPAR TPPCRLLLVL LLLPPLAASS 60 

RPRAHGAAAP SAPHWNETAE KNLGVLADED ^P^LQQNSSSN ISYSNAKQKE ITLPSRI.IYY 120 

INQDSESPYH VIJ3TKARHQQ KHNKAVHLAQ ASPQIEAFGS KFIMLIUIM GLLSSDYVBI 180 

HYENGKPQYS KGGEHCYYHG SIRGVKDSKV ALSTCNGLHG MFEDDTFVYM lEPLELVHDE 240 

KSTCBPHIIQ KTIAGQYSKQ MKNLTMERGD QWPFLSELQW LKRRKRAVNP SSGIFEEMKY 300 

LELMIVKDKK TYKKHRSSHA HTNNPAKSW NLVDSIYKBQ LNTRWLVAV ETWTEKDQID 360 

ITTOPVQMLH EFSKYRQRIK QHADAVHLIS RVTPHYKRSS LSYFGGVCSR TRGVGVNEYG 420 

LPMAVAQVLS QSLAQNLGIQ HEPSSRKPKC DCTESHGGCI MEETGVSHSR KFSKCSILEY 480 

RDFLQRGGGA CLFNRPTKLF EPTECGNGYV EAGEEOX^GP HVECYGLCCK KCSLSKGAHC 540 

SDGPCCNNTS CLFQPRGYEC RDAVNECDIT EYCTGDSGQC PPNLHKQDGY ACNQNQGRCY 600 

NGECKTRDNQ CQYIWGTKAA GSDKFCYEKL NTEGTEKGNC GKDGDRWIQC SKHDVFCGFI, 660 

IiCTNLTRAPR IGQLQGEIIP TSFYHQGRVI DCSGAHWLD DDTDVGYVED GTPCGPSMMC 720 

U)RKCLQIQA UJMSSCPLDS KGKVCSGHGV CSNEATCIC3> PTWAGTDCSI ROPVRNIJIPP 780 
KDEGPKGPSA TNLIIGSIAG AILVAAIVLG GTGWGFKMVK KRRFDPTQQG PI 

Seq ID NO: 326 DNA sequence

Nucleic Acid Accession #: AK074418.1

Coding sequence: 244-1515

1 11 21 31 41 51

CTTTCTCCAA GACGGCCGGC CATGCTCTCC TCCTCTGCCA GTCTCXTTCCA CCACTCTCTA 60 

ACCTGAGAGC CTGTGGAACC TGCCCGTCTC CCCTCCTCCA TCAGACACAC CTGCCTAGGA 120 

AACAGATGGA AAAAGTGAGG GACCGGTGAG TGACTTGCTG CTAAACTTTA TACCAGATGC 180 

AAATGACAGA GCTGGAGTTC TCCTCTGCCT GGAAAGGACC TCGGAAGTCT TCTAAGGAQA 240 

GTCATGGCX5T ATTACCAGGA GCCTTCAGTG GAGACCTCCA TCATCAAGTT CAAAGACCAG 300 

GACTTTACCA CCTTGCGGGA TCACTGCCTG AGCATGGGCC GGACGTTTAA GGATGAGACA 360 

TTCCCOGCAG CAGATTCTTC CATAGGCCAG AAGCTGCTCC AGGAAAAACG CCTCTCCAAT 420 

GTGATATGGA AGCGGCXZACA GGATCTACCA GGGGGTCCTC CTCACTTCAT CCTGGATGAT 480 

ATAAGCAGAT TTGACATCCA ACAAGGAGGC GCAGCTGACT GCTGGTTCCT GGCAGCACIG 540 

GGATCCTTGA CTCAGAACCC ACAGTACAGG CAGAAGATCC TGATGGTCCA AAGCTTTTCA 600 

CACCAGTATG CTGGCATTTT CXX5TTTCCGG TTCTGGCAAT GTGGCCAGTG GGTGGAAGTG 660 

GTGATTGATG ACCGCCTACC TGTCCAGGGA GATAAATGCC TCTTTGTGCG TCCTCGCCAC 720 

CAAAACCAAG AGTTCTGGCC CTGCCTGCTG GAGAAGGCCT ATGCCAAGCT GCTOGGATCC 780 

TATTCCGATC TGCACTATGG CTTCCTCGAG GATGCCCTGG TGGACCTCAC ACGAGGCQTG 840 

ATCACCAACA TCCATCTCCA CTCTTCCCCT GTGGACCTGG TGAAGGCAGT GAAGACAGOG 900 

ACCAAGGCAG GCTCCCTOAT AACCTGTGCC ACTCCAAGTG GGCCAACAGA TACAGCACAG 960 

GCGATGGAGA ATGGGCTCGT GAGTCTCCAT GCCTACACTG TGACTGGGGC TGAGCAGATT 1020 

CAATACCGAA GGGGCTGGGA AGAAATTATC TCCCTGTGGA ACCCCTGGGG CTGGGG OGAG 1080 

ACGGAATGGA GAGGGOSCTG GAGTGATGGG TCTCAGGAGT GGGAGGAAAC CTCTGATCOS 1140 
OGGAAAAGCC AGCTACATAA GAAAOGGGAA GATGGOGAGT TTTGGATGTC GTGTCAAGAT 1200

TTCCAACAGA AATTCATCGC CATGTTTATA TGTAGOGAAA TTCCAATTAC C CTG^ CCAT 1260 

GGAAACACAC TCCACGAAGG ATGGTCOCAA ATAATGTTTA GGAAGCAAGT GATTCTAiSGA 1320 

AACACTGCAG GAGGACCTOG GAATGATGCT CAATTCAACT TCTCTGTGCA AGftfi CCAATG 1380

GAAGGCACCA ATGTTGTCGT GTGOTrCACA GTTGCTGTCA CACCATCAAA TTTGAAAGCA 1440 

GAAGATGCAA AATTTCCACT CGATTTCCAA GTGATTCTGG CTGGCTCACA GAAACACTGT 1500 

CCAAAGCTCA AATAATAAAT TCCGCOSCAA CTTCACCATG ACTTACCATC TGAGCCCTGG 1560 

GAACTATGTP 6TGGTTGCAC AGACACGGAG AAAATCAGCG GAGTrCTTGC TCCGAATCTT 1620 

CXTTGAAAATG CCACACACTG ACAGGCACCT GAGCAGCXy^T TTCAACCTCA GAATGAAGGG 1680 

AAGCCCTTCA GAACATGGCT CCCAACAAAG C3^TTTTCAAC AGATATGCTC AGCAGGTATG 1740 

GTACCTAGCA CCCAGGGGCC TTACGTGGGA TTGGAGAAAG GGGACCTGAG G GAGGG ACAG 1800 

CCCTCACAGG CCCTTACTGG GATGCAGAGA GGAGAAGTGA CTTGATGGAC TATTTT^CT 1860

GCXrrCTCTTC CTGGATOGTC TCCAGAACTG CTGTGGCTGC CAAGCTOGGT AGAGACGTGG 1920 

CGCCOCACCC AGTCTCATCC GGGGGACTTC AAGCTGGAAT GCACAGCTTA GAAAGGGAGG 1980 

GGATAATTAT GGGGTGTGAG GTCCATTGCC CTCTAAATCT TTAAACAAGC AATTGGCAGT 2040 

ACCCCGTGAA ACCTTTCCrr CTCCtACTC^ GCCACCTCCC ACCARCCTGG CATCGTTCCT 2100 

CCCGGGAGCT AGCCAGCTTC AGAAAGCACA TACAGCATCC TTGCTGCCAA AC CROCTAT G 2160 

TGCACACAGG ATTTCCTTAA TGGCTTAATA AACTGTTATA AAGAACTCCT TCACTTGrTCA 2220 

GAATAAAATA GCTGCCAGGG GCTCTGCACA ATGAGCCTCT TACOGTTAAA AAAAAAAAAA 2280
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 327 Protein sequence:
Protein Accession #: BAB85075.1

1 11 21 31 41 51

| | | | | |

MAYYQEPSVE TSriKFKDQO PTTLSDHCLS MGRTPKDETP PAADSSIGQK LLQEKRLSNV 60 

IWKRPQDLPG GPPHPIU3DI SRFDIQQGGA ADCWFLAALG SLTQNPQYRQ KILKVQSPSH 120 

QYAGIPRFRF WQCXJQMVEW IDDRLPVQGD KCLFVRPSHQ SQEFWPCUiE KAYAKXiUSSY 180 

SDLKYGPLED ALVDLTGGVI TNIHLHSSPV DLVKAVJCTAT KAQSLITCAT PSGPTDTAQA 240 

MENGLVSLHA YTVTGAEQIQ VRRGWEEIIS :,WNPWGWGET EWRGRWSDGS QEWEETCDPR 300 

KSQLHKKRED GEFWMSCQDP QQKPIAMPIC SEIPITIiDHG NTIiHEGMSQI MFRKQVIU5M 360 

TAGGPRNDAQ FNPSVQBPMB GTNVWCVTV AVTPSNLKAB DAKFPLDFQV ILAGSQKHCP 420 
KLK 



Seq ID NO: 328 DNA sequence

Nucleic Acid Accession #: BC0n490.1

Coding sequence: 74-2788

1 11 21 31 41 51

GTGGGTCACG TGAACCACTT TTCGCGOSAA ACCTGGTTGT TCCTGTAGTG GCGGAGftGGA 60 

TCGTGGTACT GCTATGGCGG AATCATCGGA ATCCTTCACC ATGGCATCCA GCCCGGCCCA 120 

GCGTOXSCGA GGCAATGATC CTCTCACXTTC CAGCCCTGGC CGAAGCTCCC GGCGTACTGA 180 

TGCCCTCACC TCCAGCKCTG GCCX3TGACCT TCCACCATTT GAGGATGAGT CCGAGGGGCT 240 

CCTAGGCACA GAGGGGCCCC TGGAGGAAGA AGftGGATGGA GAGGAGCTCA TTGGAGATGG 300 

CATGGAAAQG GACTACCGOG CCATCCCSGA GCTOQACGCC TATGAGGCOG AGGGACTGGC 360 

TCTGGATGAT GAGGACGTAG AGGAGCTGAC GGCCAGTCAG AGGGAGGCAG CAGAGCGGGC 420 

CATGOGGCAG OGTGACOGGG AGGCTGGCCG GGGCCTGGGC CGCATGCGCC GTGGGCTCCT 480 

GTATGACAGC GATGAGGAGG ACGAGGAGCG CCCTGCCCGC AAGOGCCGCC AGGTGGAGCG 540 

GGGCAOGGAG GACGGOGAGG AGGACGAGGA GATGATCGAG AGCATCGAGA ACCTGGAGGA 600 

TCTCAAAGGC CACTCTGTGC GCGAGTGGGT GAGCATGGCG GGCCCCCGGC TGGAGATCCA 660 

CCACCX3CTTC AAGAACTTCC TGCGCACTCA GGTCGACAGC CAOSGCCACA ACGTCTTCAA 720 

GGAGCGCATC AGOGACATGT GCAAASAGAA OCXSTGAGAGC CTGGTGGTGA ACTATGAGGA 780 

CTTGGCAGCC AGGGAGCACG TGCTGGCCTA CTTCCTGCXrr GAGGCAOCGG OGGAGCTGCT 840 

GCAGATCTTT GATGAGGCTG CCCTGGAGGT GGTACTGGCC ATGTACCCCA AGTAOGACCG 900 

CATCACCAAC CACATCCATG TCCGCATCTC CCACCTGCCT CTGGTGGAGG AGCTGCX^CTC 960 

GCTGAGGCAG CTGCATCTGA ACCAGCTGAT CCX3CACCAGT GGGGTGGTGA CX^VGCTGCAC 1020 

TGGOGTCCTG CCCCAGCTCA GCATGGTCAA GTACAACTGC AACAAGTGCA ATTTCGTCCT 1080 

GGGTCCTTTC TGCCAGTOCC AGAACCAGGA GGTGAAACCA GGCTCCTGTC CTGAGTGCCA 1140 

GTC3GGCCGGC CCCTTTGAGG TCAACATGGA GGAGACCATC TATCAGAACT ACCAGOGTAT 1200 

CCGAATCCAG GAGAGTCCAG GCAAAGTGGC GGCTGGCCGG CTGCCCCGCT CCAAGGACGC 1260 

CATTCTCCTC GCAGATCTGG TGGACAGCTG CAAGCCAGGA GACGAGATAG AGCT GACTG G 1320 

CATCTATCAC AACAACTATG ATGGCTCCCT CAACACTGCC AATGGCTTCC CTGTCTTTGC 1380 

CACTGTCATC CTACCCAACC ACGTGGCCAA GAAGGACAAC AAGGTTGCTG TAGGGGAACT 1440 

GACOGATGAA GATGTGAAGA TGATCACTAG CXTCTCCAAG GATCAGCAGA TOGGAGAGAA 1500 

GATCTTTGCC AGCATTGCTC CTTCCATCTA TGGTCATGAA GACATCAAGA GAQ8CCTGGC 1560 

TCTGGCCCTG TTCCGAGGGG AGCCCAAAAA CCCAGGTGGC AAGCACAAGG TACGTXKTTGA 1620 

TATCAACGTG CTCTTGTGOG GAGACCCTGG CACAGCGAAG TCGCAGTTTC TCAAGTATAT 1680 

TGAGAAAGTG TCCAGCOGAG CCATCTTCAC CACTGGCCAG GGGGCGTOGG CTGTGGGCCT 1740 

CACGGOGTAT GTCCAGCGGC ACCCTGTCAG CAGGGAGTGG ACCTTGGAGG CTGGGGCCCT 1800 

GCTTCTGGCT GACCGAG6AG TGTGTCTCAT T6ATGAATTT GACAA6ATGA ATGACCAGGA 1860 

CAGAACCAGC ATCCATGAGG CCATGGAGCA ACAGAGCATC TCCATCTCGA AGGCTGGCAT 1920 

CGTCACCTCX: CTGCAGGCTC GCTGCACGGT CATTGCTGCC GCCAACCCCA TAGGAGGGOS 1980 

CTACGACCCC TCGCTGACTT TCTCTGAGAA OGTGGACCTC ACAGAGCCCA TCATCTCAC3Q 2040 

CTTTGACATC CTGTGTGTGG TGAGGGACAC OGTGGACCCA GTCCAGGACG AGATGCTGGC 2100 

CCGCTTCGTG GTGGGCAGCC ACGTCAGACA CCACCCCAGC AACAAGGAGG AGGAGGGGCT 2160 

GGCCAATGGC AGCGCTGCTG AGCCCGCCAT GCCCAACACG TATGGCGTGG AGCCCCTGCC 2220 

CCAGGAGGTC CTGAAGAAGT ACATCATCTA CX3CCAAGGAG AGGGTCCACX: CGAAGCTCAA 2280 

CCAGATGGAC CAGGACAAGG TGGOCAAGAT GTACAGTGAC CTGA GGAA AG AATCTATGGC 2340 

GACAGGCAGC ATCCCCATTA CGGTGOGGCA CATOSAGTCC ATGATCOSCA TGGGOGAGGC 2400 

CCACGCGCGC ATCCATCTGC GGGACTATGT GATOGAAGAC GACGTCAACA TGGCCATCCG 2460 

CGTCATCCTO GAGAGCTTCA TAGACACACA GAAGTTCAGC GTCATGCGCA GCATGOSCAA 2520

GACTTTTGCC OGCTACCTTT CATTCCGGCG TGACAACAAT GAGCTGTTGC TCTTCA*EACT 2580 

GAACCAGTTA GTGGCA6AGC AGGTGACATA TCACOGCAAC OSCTTTGGGG GCCAGCAGGA 2640 

CACTATTGAG GTCCCTGAGA AlGGACTrGGT GGATAAGGCP OGTCRfiATCA ACATCCACaA 2700 




CCTCTCTGCA TTTTATGACA GTGAGCTCTT CAGGATGAAC AAGTTCAGCC ACGACCTGAA 2760

AAGGAAAATG ATOCTGCAGC AG TTCTGAGG O CCTAT GCCA TCCATAAGGA TrCCTTGGGR 2820

TTCTGGTTTG GGGIGGTCAC TGCCCTCTGT GCTTTAtQGA CACAAAACCA GA6CACTTGA 2880

TGAACTCGGG GTACTAGGGT CAGGGCTT ftT AGCAGGATGT CTG6CTGCAC CTGGCATGAC 2940

TGTTTGTTTC TCCAAGCCTG CTTTGTGCTT CTCAOCTTTG GGTGGGAIGC CTTGCCAGTG 3000

TGTCTTACTT GGTTGCTGAA CATCTTGCCA CCTCOGAGTG CTTTGTCTCC ACTCAGTACC 3060

TTOGATCAGA. GCTGCTGflGT TC AGGA.TGOC TGOGTGTGGT TTAGGTGTTA GCC TTCTTA C 3120

ATGGATOTCA G6AGAGCTGC TGCOCTCTTG GOGTGAGTTG CGTATTCAGG CTGCTTTTGC 3180

TGCCTTTGGC CAGAGAGCTG GTTGAAGATG TTTGTAATOG TTTTCAGTCT CCTGCAGGTT 3240

TCTGTGCCCC TGTGGTGGAA GAGGGCACGA CAGTGCCAGC GCAGCGTTCT GGGCTCCTCA 3300

CTCGCAGGGG TGGGATGTGA GTCATGCGGA TTATCCACTC GCCACAGTTA TdGCTGCCA 3360

TTGCTCCCTG TCTGTTTCCC CACTCTCTTA TTTGTGCATT CGGTTTGGTT TCTGTAGTTT 3420
TAATTTTTAA TAAAGTTGAA TAAAATATAA AAAAAAAAAA AAAAAA

Seq ID NO: 329 Protein sequence
Protein Accession #: AAH17490.1






1 11 21 31 41 51

| | | | | |

KAESSESFTM ASSPAQRRRG MDPI.TSSPGR SSRRTDALTS SPGROLPPFE OSSEGLLGTS

GPLEEEEDGE ELIGDGMERD YRAIPEU>AY EAEQLAIJ3DB OVEELTASQR BAAERAMRQR

DREAGRGLGR MRRGLLYSSD            RRQfVERATED GEEOEEMIES IE!n*EDLKGH

SVREWVSMAG PRLEIHHRFK NFLRTHVDSH GKNVFKERIS DMCKEKRESL WHYEDLAAR

EHVLAYPLPE APAELUQIPD EAAIiE\A^U\M YPKYDRITNH IHVRISHLPL VESLRSLRQL

HLHQIiIRTSG WTSCTGVLP QLSMVKWOI KCNFVLGPPC QSQNQEVKPG SCPBOQSAGP

TEVtlMEETIY CNYQRIRIQB SPGKVAAGRL PRSKDAILIiA DLVDSCKPGD BIBLTGIYHN

HYDGSLNTAN GFPVFATVIL ANHVAKKDNK VAVGELTDED VKMITSLSKD QQIGEKIFAS

I/^SIYGHED IKRG3jALAI.F GGEPKMPGGK H3CVRGDINVL LCGDPGTAKS QFLKYIEKVS

SRAIPTTGQG ASAVGLTAYV QRHPVSHEHT LEA6ALVLAD RGVCLXOBFD KHNDQDRTSZ

HHAMEQQSIS ISKAGIVTSL QARCTVIAAA NPIGGRYDFS LTPSENVDliT EPXISRFDZL

CWRDTVDPV QDEMIARPW GSHVRHHPSU KEEEGLAKGS AABPAMPNTY GVEPLPQEVL

KKYIIYAKER VHPKLNQMDQ DKVAKMYSDL RKESMATGSI PITVRHIESM IRMAEAHARI

HLRDYVIBDD VMMAIRVMIiE SFIDTQKPSV MRSMRKTFAR YIiSFRRDNNE LLLFIXiKQLV

AEQVTYQRNR FGAQQOTIEV PEKDLVDKAR            YPSELFRMKK PSHDLKRKMI
LQQF



Seq ID NO: 330 DNA sequence 
Nucleic Acid Accession #: M17254 
Coding sequence: 257-1645 



1 11 21 31 41 51

| | | | | |

GTCCGCGCGT GTCCGCGCCC            G0GC6CGTGC CTTGGC06TG OGOGOOGAGC 60

CGGGTCGCAC TAACTCCCTC            GCGGCGCTAA CCTCTOGGTT ATTCCAGGAT 120

CTTTGGAGAC CCGAGGAAAG            CAAAAGCAAG ACAAATGACT CACAGAGAAA 180

AAAGAT6GCA GAACCAAGGG            CGTCAGGTTC TGAACAGCTG GTAGATGGGC 240

TGGCTTACTG AAGGACATGA            CCCGG ACCCA GCAGCrCATA TCAAGGAAGC 300

CTTATCAGTT GT6A6TGAGG            GTTTGAGTGT GCCTACGGAA GGCCACACCT 360

GGCTAAGACA GAGATGACCG            CAGCGACTAT GGACA6ACTT CCAAGATGAG 420

CCCACGCGTC CCTCAGCAGG ATTGGCTGTC TCAACCCCCA GCX3VGGGTCA OCATGAAAAT 480

GGAATGTAAC CCTAGCCAGG TGAATGGCTC AAGGAACTCT CCTGATGAAT GCAGTGTGGC 540

CAAAGGCGGG AAGATGGTGG GCAGCCCAGA CACCGTTGGG ATGAACTAOC GCAGCTACAT 600

GGAGGAGAAG CACATGCCAC CCCCAAACAT GACCACGAAC GAGCGCAGAG TTATCGTGCC 660

AGCAGATCCT ACGCTATGGA GTACAGACCA TGTGCGGCAG TGGCTGGAGT GGGOGGTGAA 720

AGAATATGGC CTTCCAGACG TCAACATCTT GTTATTCCAG AACAT0GAT6 G6AA06AACT 780

GTGCAAGATG ACCAAGGACG ACTTCCAGAG GCTCACCCCC AGCTACAAOG OC3GACATCCT 840

TCTCTCACAT CTCCACTACC TCAGAGAGAC TCCTCTTCCA CATTTGACTT CAGATGATGT 900

TGATAAAGCC TTACAAAACT CTCCACGGTT AATGCATGCT AGAAACACAG ATTTACCATA 960

TGAGCCCCCX: AGGAGATCAG CCTQ6ACCGG TCACGGCCAC CCCACGCCCC AGTOGAAAGC 1020

TGCTCAACCA TCTCCTTCCA CAGTGCCCAA AACTGAAGAC CAGOGTCCTC AGTTAGATCC 1080

TTATCAGATT CTTGGACCAA CAAGTAGCCG CCTTGCAAAT CCAGGCAGTG GCCAGATCCA 1140

GCTTTGGCAG TTCCTCCTGG AGCTOCTGTC GGACAGCTCC AACrcCAGCT GCATCAOCTG 1200

GGAAGGCACC AACGGGGAGT TCAAGATGAC GGATCCOGAC GAGGTGGCGC G6CGCT660G 1260

AGAOCGGAAG AGCAAACCCA ACATGAACTA CGATAAGCTC AGCCX3CGCCC TCOGTTACTA 1320

CTATGACAAG AACATCATGA CCAAGGTCCA TGGGAAGCGC TACGCCTACA AGTTCGACTT 1380

CCAOGOGATC GCCCAGGCCC TCCAGCCCCA CXCCCOGGAG TCATCTCTGT ACAAGTACCC 1440

CTCAGACCTC COrrACATGG GCTCCTATCA CGCC CACCCA CAGAAGATGA ACTTTGTGGC 1500

GCCCCACCCT CCAGCCCTCC GCGTGACATC TTCCAGTTTT TTtGCTGCCC CAAAOCCATA 1560

CTGGAATTCA CCAACTGGGG GTATATACCC CAACACTAGG CTCOCCACCA GCCATATGCC 1620

TTCTCATCTG GGCACTTACT ACTAAAGACC TGGCGGAGGC TTTTCCCATC AGCGTGCATT 1680

CaCCAGCCCA TOGCCACAAA CTCTATCGGA GAACATGAAT CAAAAGTGCC TCAAGAGGAA 1740

TGAAAAAA6C TTTACTGGGG CTOGGGAAGG AAGCCXSGGGA AGAGATCCAA AGACTCTTGG 1800

GAiGGGAGTTA CTGAAOTCTT ACTACAGAAA TGAGGAGGAT GCTAAAAATG TCACGAATAT 1860

GGACATATCA TCTGTGGACT GACCTTGIAA AAGACAGT6T ATGTAGAAGC ATGAAGTCTT 1920

AAGGACAAAG TGCCAAAGAA AGTGGTCTTA AGAAATGTAT AAACTTTAGA GTAGAGTTTG 1980

AATCCCACTA ATGCAAACTG GGATGAAACr AAAGCAATAG AAACAACACA GTTTTGACCT 2040

AACATACCGT TTATAATGCC ATTTTAAGGA AAACTACXTTG TATTTAAAAA TAGTTTCATA 2100

TCAAAAACAA GAGAAAACAC ACGAGAGAGA CTGTGGCCCA TCAACAGAOG TTGATATGCA 2160

ACTGCATGGC ATGTGCTGTT TTGGTTGAAA TCAAATACAT TCCGTTTGAT GGACAGCTGT 2220

CAGCTTTCTC AAACTGTGAA GATGACCCAA AGTTTCCAAC TCCTTTACAG TATTACCX5GG 2280

ACtATGAACr AAAAGGTGGG ACTGAGGATG tCTATAGAGT GAG06TGTGA TTGTA6ACAG 2340

AGGGGTGAAG AACGAGGAGG AAGAGGCAGA 6AAG6AGGAG ACCAOGCTOG GAAAGAAACT 2400

TCrCAAGCAA TGAAGACTGG ACrCAGGACA TrrGGGGACT GTGTACAATG AGTTATGGAG 2460

ACTOGAGGGT TCATGCAGTC AGTGTTATAC CAAACCCAGT GTTAGGAGAA AGGACACAGC 2520

GIAATGGAGA AAGGGAAGTA GTAGAATTCA GAAACAAAAA TGOGCATCTC • ITi ' C l' ri ' CilT 2580

TCTCAAATGA AAATTTTAAC TGGAATTGTC TGATATTTAA GAGAAACAT7 CAG6ACCTCA 2640

TCATTATGTG OGGGCTTTGT TCTCCACA06 GTCAGGTAAG AGATQGOCTT CTTGGCTGCX; 2700




ACAATCAGAA ATCACGCAGG CATTTTGGGT AGGOQGCCTC CACTTTTCCT TTGAGTOGOG 2760

AACGCTGTGC GTTTGTCAGA ATGAAGTATA CAAG TCAA TG TTTTTCCCCC TTTTTATATA 2820

ATAATTATAT AACTTATGCA TTTATACACT AOGA G TPGAT CTCGGCCAGC CAAAGACACA 2880

CGACAAAAGA GACAATCGAT ATAATGTGGC CTTGAATTTT AACTCTGTAT GCTTAATGTT 2940

TAOUVTATGA AfiTTATTAGT TCTTAGAATG CAGAATGTAT GTAATAAAAT AAGCTTGGCC 3000

TAGCATGGCA AATCAGATTT ATACAGGAGT CTGCATTTGC ACTTTTTTTA GTGACTAAAG 3060

TTGCTTAATG AAAACATGTG CTGAATGTTG TGCSATrTTGT GTTATAATTT ACTTTGTCCA 3120
GGAACTTCTG CAAGQC3«iGAG CCAAGGAAAT AGGATGTTTG GCACCC

Seq ID NO: 331 Protein sequence
Protein Accession #: AAA52398

1 11 21 31 41 51

| | | | | |

MIQTVPDPAA HIKEALSWS EDQSLFSCAY GTPHLAKTEM TASSSSOYGQ TSKMSPRVPQ 60

QOWLSQPPAR VTIKMECNPS QVNGSRNSPD ECSVAK6GKM VGSPOTVGMS YGSYMEEKHM 120

PPPNMTTIIER RVIVPADPTL. HSTDUVRQKL EWAVKEYGI^P DVNILLFQNI DGKELCKMTK 180

DOFQRLTPSY KADILLSHLH YZAETPLPHL TSDOVDKALQ NSPRUfHARM TDLPYEPPRR 240

SAWTGHGHPT PQ5KAAQPSP STVPKTEDQR POLOPYQILG PTSSRLANPG SGQIQLKQFL 300

LELLSDSSNS SCITWEGTNG EFKMTDPDEV ARRKGERKSK PNMNYOKLSR ALRYYYDKNI 360

MTKVHGKRYA YKFDFHGIAQ AliQPHPPESS LYKYPSOLPY KGSYBAHFOK MNFVAPHPPA 420

LPVTSSSFFA APMPYWNSPT GGIYPNTRLP TSHMPSRLGT YY 462

Seq ID NO: 332 DNA sequence
Nucleic Acid Accession #: NM_000020
Coding sequence: 283-1794

1 11 21 31 41 51

| | | | | |

AGGAAAOGGT TTATTAGGAG GGAGTGGTGG AGCTGGGCCA GGCAGGAAGA CGCTGGAATA 60

AGAAACATTT TTGCTCCAGC OCCCATCCCA GTCCCGGGAG GCTGCCGCGC CAGCTGCGCC 120

GAGOGAGCCC CTCCCOGGCT CCAGCCOGGT CCGGGGCCGC GCCGGACCCC AGCCCGCCGT 180

CCAGCGCTGG CGGTGCAACT GCGGCCGCGC GGTGGAGGGG AGGTGGCCCC GGTCOGCOGA 240

AGGCTAGCGC CCCGCCACCC GCAGAGCGGG CCCAGAGGGA CCATGACCTT GGGCTCCCCC 300

AGGAAAGGCC TTCTGATGCT GCTGATGGCC TTGGTGACCC AGGGAGACCC TGTGAAGCG6 360

TCT0C3GGGCC CGCTGGTGAC CTGCACGTGT GAGAGCCCAC ATTGCAAGGG GCCTACCTGC 420

GGGGGGGCCT GGTGCACAGT AGTGCTGGTG CGGGAGGAGG GGAGGCACCC CCAGGAACAT 480

OGGGGCTG06 GGAACTTGCA CAGGGAGCTC TGCAGGGGGC GCCCCACCGA GTTOGTCAAC 540

CACTACTGCT GCGACAGCCA CCTCTGCAAC CACAACGTGT CCCTGGTGCT GGAGGGCAOC 600

CAACCTCCTT CGGAGCAGCC GGGAACAGAT GGCCAGCTGG CCCTCATCCT GGGCCCCGTG 660

CTGGCCTTGC TGGCCCTGGT GGCCCTGGGT GTCCTGGGCC TGTGGCATGT CCGACGGAGG 720

CAGGAGAAGC AGOGTGGCCT GCACAGCGAG CTGGGAGAGT CCAGTCTCAT CCTGAAAGCA 780

TCTGAGCAGG GCGACACGAT GTTGGGGGAC CTCCTGGACA GTGACTGCAC CACAGGGAGT 840

GGCTCAGGGC TCCccrrccT GGTGCAGAGG ACAGTGGCAC GGCAGGTTGC CTTGGTGGAG 900

TGTGTGGGAA AAGGCCGCTA TGGGGAAGTG TGGCGGGGCT TGTGGCAOGG TGAGAGTGTG 960

GCCX3TCAAGA TCTTCTCCTC GAGGGATGAA CAGTCCTGGT TCCGGGAGAC TGAGATCTAT 1020

AACACAGTAT TGCTCAGACA CGACAACATC CTAGGCTTCA TCGGCTCAGA CATGACCTCC 1080

CGCAACTCGA GCACGCAGCT GTGGCTCATC ACGCACTACC ACGAGCACGG CTCCCTCTAC 1140

GACTTTCTGC AGAGACAGAC GCTGGAGCCC CATCTGGCTC TGAGGCTAGC TGTGTCCGOG 1200

GCATGCGGCC TGGCGCACCT GCACGTGGAG ATCTTCGGTA CACAGGGCAA ACCAGCCATT 1260

GCCCACCGCG ACTTCAAGAG CCGCAATGTG CTGGTCAAGA GCAACCTGCA GTGTTGCATC 1320

GG06ACCTGG GCCTGGCTGT GATGCACTCA            ATTACCTGGA CATOGGCAAC 1380

AACCCGAGAG TGGGCACCAA GCGGTACATG GCACCCGAGG TGCTGGACGA GCAGATCCGC 1440

ACGGACTGCT TTGAGTCCTA CAAGTGGACT GACATCTGGG            GGTGCTGTGG 1500

GAGATTGCCC GCCGGACCAT CGTGAATGGC ATCGTGGAGG ACTATAGACC ACCXTTTCTAT 1560

GATGTGGT6C CCAATGACCC CAGCTTTGAG GACAT6AAGA AGGTGGTGTG TGTGGATCAG 1620

CAGACCCXTCA CCATCXICTAA CCGGCTGGCT GCAGACCCGG TCCTCTCAGG CCTAGCTCAG 1680

ATGATGCGG6 AGTGCTGGTA CCCAAACCCC TCTGCCCGAC TCACCGCGCT GCGGATCAAG 1740

AAGACACTAC AAAAAATTAG CAACAGTCCA GAGAAGCCTA AAGTGATTCA ATAGCX:CAGG 1800

AGCACCTGAT TCCTTTCTGC CTGCAGGGGG CTGG6GGGGT GGGGGGCAGT GGATGGTGGC 1860

CTATCTGGGT AGAGGTAGTG TGAGTGTGGT GTGTGCTGGG GATGGGCAGC TGCGCCTGCC 1920

TGCTOGGCCC CCAGCCCACC CAGCCAAAAA TACAGCTGGG CTGAAACCTG ATCCCCTGCT 1980

GTCTGGCCTG CTCAAAGCGG CAGGCTCCCT GACGCCTGGC TCTCTCCCCA CCCCTATGGC 2040

CAGCATGGTG CACCCCCTAC CACTCCCGGG ACAGGATGCA AAAGAGGCTC CAGAGTCAGA 2100

GTGCCAAGCC AGGGAATCX:C AGTCCCAGAC TCAGAGCCCG G6CCTGCACT TTOCCCCCTG 2160

CCCTTGATCA ACCCCACTGC CCCACCAGAG CTGCCAGGGT GGCACAGGGC CCTGTCCAGC 2220

CCCTGGCACA CACTTCCCTG CCAGGCCTCA GCCTCTAGCA TAAGCTCCAG AGAGCCAGGG 2280

CCCATCAGTT TCTCTCTGTG GATTTGTATC TCAGCTCCAT GATGCCTTGG GCTTTCTGTC 2340

TCCTCAACAA GAGTGCAGCT TGCTGAATGT CAGCTGCCTG AGAGAGCTGG GGCCTGACTT 2400

ACTAGGGCAT TAAATCCTAA GAGGTCCTAC TGAGGTGTGG CAGGATCACA GGCCAGTGGA 2460

AAAAGGGCAG GTCAGATGG6 CAAGGCCCA6 GACTTTCAGA TTAACTGAGA GGATATOGAG 2520

GCCAAGCATG GCAGGGGGAA GGTCAGTGGG TGTCAAGAGA 0CCAG6TCTG ACCCCGGATG 2580

TTTGCTCCAT GTGACAAAAG CAGGCCTGTC TCAGGACCTT            TTTTOCTTCT 2640

TTTTTTTTTT GACACGGAGT TTCGCTCTTG TTGTCCAGGC TAGAGTGCAA TGGCATGATC 2700

CCAGCTCACC GCAACGTCTA CCTCCCAGGT TCAAATCATT CTCTTGCCTC AGACTCCCGA 2760

GTAGCTGGGA TTACAGGCAC ATGCCACCAT GCCTGGCTAA TTTTGTATAT TTAGTAGAAA 2820

CAGGGTTTCA CCATGCTGGC CATGCTGGTT CTOGAACTCC TGACCTCAGG TGTTCCACCT 2880

ACCTCAGCCT CCCAAAGTGC TGGGGTTACA GGTGTGAGCC ATOGOGCCTG GCCAOGACCT 2940

TTGTTTCTTA TCTACATATT GGAAGATTTG GTCCTGATGT CCTTTGAGGC TTCTTTAGCT 3000

CTAGTTCTCT GACACTTCAG CCTATATCAC AGCTAACTTC YTCAGTCTCA TCTATTCCTT 3060

ATGCTCCAGC CCCTGGCAAT TTGCCTCAAG ATOGGGGTTT GAAAATAACT TTACCTGACT 3120

CAAGGAGTGT CTGGAGCACC TCCTAGTCTA AGTCPGCAAG CTCCAGTTCT TGCCTAAAAC 3180

CATGCCAGTG GCCACCCTTG GGCTCAGACA GCTCTGGGCC TTTTGACCAC AAGCCAGCCC 3240

CTCGCCCTCT CTGTGGCATA GTCTTCTCTG CXXCAGGACT GCAGGGOGGC TTCCTCCAAG 3300

GCTTCCAAGG CTCAAAAGAA ATTTGGCTCC ATCCAAGAAG GCTCCAGCTC CCCTACTGGC 3360

CXXrrGGCTTC AGGCCCACAC CCCTGGGCCA GGSCCAGAGA GTGTGTCTCA GGAGAATTCA 3420

ATGGGCTCTA GAGAGACACA CAGAAAGTTT GGGCATTTGG GAAATTTTCA AG(31TGTATG 3480

TAIGGYTCAC GTATGGWGCA GGTTGTCCTG CTCCYKGGGT GCAGGGAAGT GGCCTGCAGG 3540

GAAGTGGATT GGAGGGGAGC TTGAGGAATA TAAGGAGOGG GGGTGCAfiAC TCAGGCTATG 3600 

GACAAGCavCA GCCOOVAGGT TGGGAAGACC TGGOCTTAGT OGXCCTCAGC CTAGGGCAGG 3660 

GCAGTGAACA AAGCTCTCCC CGCTCCTGCT GTAATCACCC AGACIW3CCT CCCCW3GCCG 3720 

GCATCTTATG TGTGTCTTCC ACCATCCTCA TGGTGGCACT TTTCTAGGCC TGTCTCCCAG 3780 

CATTGTGCAA GGCTCGGAAG AGAACCAGGA AGTGAAACTG GGTGAAAACA GAAAG CTCAA 3840 

TOGATGGGCT AGGTTCCCAC ATCATTAGGG CAGAGTTTGC ACGTCCTCTG GTTCACTGQG 3900 

AATCCaCCCA GOCCAOGAAT CATCTCCCTC TTTGAAGGAT TTTOATTTCT ACTGGGTTTT 3960 

GGAACAAACT CCTGC TO AGA CCCCACfiGCC AGAAACTGAA AGCAGCAGCT CCCCAAAGCC 4020 

TCGAAAATCC CTAAGAGAAG GCCTGGGGGA MAGGAAKTGG AGTCACAGGG GACAGGTAGA 4080 

GAGAAGGGGG CCC3UVTGGCC ACGGAtSTGAA GGACGTGGCG TTGCIG AGAG CAGTrC TGCAC 4140 

ATGCTTCTGT CTGAGTGCAG GAAGGTGTTC CAGGGTCGAA ATTACACTTC TOGIACCTOG 4200 

AGAOGCTCTT TGTCGGAGCA CIGGGCTCAT GCCTGGCACA CAATAGGTCT GCAATAAACC 4260 
AXGGTTAAAT CCTGAAAAAA AAAAAAAAA 

Seq ID NO: 333 Protein sequence 
Protein Accession #: NP_000011



1 11 21 31 41 51 

JlTLGSPRKGL LMLU4ALVTQ GDPVKPSRGP LVTCTCBSPH CKGPTCRGAW CTWLVREEG 60 

RHPQEHRGCG NLKRELCRGR PTEFVMHVCC DSHLCNHNVS LVLEATQPPS BQPGTDGQIA 120 

LILGPVLALL ALVALGVLGL WHVRRRQEKQ RGLHSELGES SLZLKASECyS OTOWSDLLDS 180 

DCTTGSGSGL PFLVQR1VAR QVAIjVECVGK GRYGBVWRGL MHGESVAVKI PSSRDBQSMP 240 

RBTEIYNTVIi LRHDNIUSFI ASDMTSRNSS TQLWLITHYH EHGSLYDFLQ RQTLEPHLAL 300 

RLAVSAACGL AHLHVEIPGT QGKPAIAHHD FKSRNVLVKS NLQCCIADLG LAVMHSQGSD 360 

YLDIGNNPRV GTfCRYMAPEV LDBQIRTDCP BSYKWTDIWA PGLVLWEIAR RTIVNGIVED 420 

YRPPPYDWP NDPSFBDMKK WCVDQQTPT IPKRLAADPV LSGLAQMMRE CWYPNPSARL 480 
TALRIKKTLQ KISNSPEKPK VIQ 

Seq ID NO: 334 DNA sequence 

Nucleic Acid Accession #: NM_004126.1

Coding sequence: 106-329

1 11 21 31 41 51

GGCACGAGCT CGTGCCGGCC TTCAGTTGTT TCGGGACGCG CCGAGCTTCG CCGCTCTTCC 60 

AGCGGCTCOG CTGCCAGAGC TAGCCCGAGC COGGTTCTGG GGCGAAAATG CCTGCCCTTC 120 

ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCGCAAAG 180 

AAGTGAAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AA CTATATT G 240 

AAGAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA 300 

AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 360 

AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA 420 

TCAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAGCATAT 480 

GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATC AGAT GT 540 

ACAATTATGG AAATAAGAAC ATTACTTGAG CAT6ACACTT CTTTCAGTAT ATTGCTTGAT 600 
GCTTCAAATA AA6TTTTGTC TT 

Seq ID NO: 335 Protein sequence 
Protein Accession #: NP_004117.1

1 11 21 31 41 51 

MPALHIEDLP EKEiCLKMBVE QLRKBVKLQR QQVSKCSEEI KNYIBERSGE DPLVKGIPED 60 
KNPFKEKGSC VIS 



Seq ID NO: 336 DNA sequence 
Nucleic Acid Accession #: NM_005795
Coding sequence: 555-1940 

1 11 21 31 41 51 

| | | | | |

GCACGAGGGA ACAACCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAGGACCAT 60 

CAAGCrCTGC TAACTGAATC TCATCCTAAT TGCAQGATCA CATTGCAAAG CTTTCACTCT 120 

TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGOGGAATCT CAGAAAGTAA AGTTCCATCC 180 

TGAGAATATT TCACAAAGAA TTTCCTTAAG AGCTGGACTG GGTCTTGACC CCTGQAATTT 240 

AAGAAATTCT TAAAGACAAT GTCAAATATG ATCCAAGAGA AAATGTGATT TGAGTCTGGA 300 

GACAATTGTG CATATCGTCT AATAATAAAA ACCCATACTA GCCTATAGAA AACAATATTT 360 

GAATAATAAA AACCCATACT AGCCTATAGA AAACAATATT TGAAAGATTG CTACCACTAA 420 

AAAGAAAACT ACTACAACTT GACAAGACTG CTGCAAACTT CSU^TTGGTCA CCACAACTTG 480 

ACAACGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTATAC AGCATATTTC 540 

ATTTGGGCTT AATGATGGAG AAAAAGTGTA CCXrPGTATTT TCTGGTTCTC TTGCCTTTTT 600 

TTATGATTCT TGTTACAGCA GAATTAGAAG AGAGTCCTGA GGACTCAATT CAGTTGGGAG 660 

TTACTAGAAA TAAAATCATG ACAGCTCAAT ATGAATGTTA CCAAAAGATT ATGCAAGACC 720 

CCATTCRACA AGCAGAAGGC GTTTACTGCA ACAGAACCTG GGATGGATGG CTCTGCTGGA 780 

ACGATGTTGC AGCAGGAACT GAATCAATGC AGCTCTGCCC TGATTA CTTT CAGGACTTTG 840 

ATCCATCAGA AAAAGTTACA AAGATCTGTG ACCAAGATGG AAACTGGTTT AGACATCCAG 900 

CAAGCAACAG AACATGGACA AATTATACCC AGTGTAATGT TAACACCCAC GAGAAAGTGA 960 

AGACTGOtfrr AAATTTGrrr TAGCTGACCA TAATTGGACA CJGGATTGTCT ATTGCATCAC 1020 

TGCTTATCTC GCTTGGCATA TTCTTTTATT TCAAGAGCCT AAGTTGCCAA AGGATTACCT 1080 

TACACAAAAA TCTGTTCTTC TCATTTGTTT GTAACrCTGT TGTAACAATC ATTCACCTCA 1140 

CTGCAGTGGC CAACAACCAG GCCTTAGTAG CCACAAATCC TGTTAGTTGC AAAGTGTCCC 1200

AGTTCATTCA TCTTTACCTG ATGGGCTGTA ATTACTTTTG GATGCTCTGT GAAOGCATTT 1260 

ACCTACACAC ACTCATTGTG GTGGCOGTGT TTGCA6AGAA GCAACATTTA AT6TGGTATT 1320 

ATTnCTTGG CTGGGGATTT CCACTGATTC CTGCTTGTAT ACATGCCATT GCTAGAAGCT 1380

TATATTACAA TGACAATTGC TGGATCaGTT C TGATACC CA TCTCCTCTAC ATTATCCATG 1440

GCCCAASTTG TCCTGCTTTA CTGGTGAATC TTTTTTTCTT GTTAAATATT GTAOGOGTTC 1500

TCATCACCAA GTTAAAAGTT AOUSUXAAG OGGAATCCAft TCTGTACAIG AAAGCXGTGA 1560

GAGCTACTCr TATCTTQGTG CCATTGCTTG GCATTGAATT TGtGCTGATT GCAtGGOGAC 1620

CTGAAGGAAA GATTGCAGAC GAGGTATATG ACTiUZATCAT GGACATCCTT ATGCACTTCC 1680

AGGGTCTTTT GGTCTCTACC ATTTTCTGCT TCTTTAATGG AGAGGTTCAA GCAATTCTGA 1740

GAAGAAACTG GAATCAATAC AAAATCCAAT TTGGAAACAG CTTTTCCAAC TCAGAAGCTC 1800

TTOGTACTGC GTCTTACACA GTIGTCAACAA TCAGIXSATGG TOCAGGTEAT AGTCATGACT 1860

GTCXTAGTGA ACA.CTTAAAT GGAAAAAGCA TC CATGATA T TGAAAATGTT CTCTTAAAAC 1920

CAGAAAATTT ATATAATTGA AAATAGAAGG ATGGTTCTCT CACTGTTTGG TGCTTCTCCT 1980

AACrCAAGGA CTTGGACCCA TGACTCTGTA GCCACAAGAC TTCAATATTA AAT6ACTTTG 2040

GGGAATCTCA TAAAGAAGAG CCTTCACATG AA ATTAG TAG TX5TGTTGATA AGAGTGTAAC 2100

ATCCAGCTCT ATGTGGGAAA AAAGAAATCC TGGTTTGTAA TGTTTGTCAG TAAATACTCC 2160

CACTATGCCT GATGTGAOGC TACTAACCTG ACATCACCAA GTGTGGAATT GGAGAAAAGC 2220

ACAATCAACT TTTCTGAGCT GGTGTAAGCC AGTTCCAGCA CACCATTGAT GAATTCAAAC 2280

AAATGGCTGT AAAACTAAAC ATACATGTTG 0GCATC31TTC TAOCCFTATT CSCCOCAAGA 2340

GACCTAGCTA AGGTCTATAA ACAT GAAGGG A AAATTAGC T TTTAfflTTTA AAACTCTTTA 2400

TCCCATCTTG ATTGGGGCAG TTGACTTTTT TTTTTTCCCA GAGTGCOGTA GTCCTTTTTG 2460

TAACTACCCT CTCAAATGGA CAATACCAGA ACSTTGAATTAT CCCTGCTGGC TTTCTTTTCT 2520

CTATGAAAAG CAACTGAGTA CAATTCTTAT GATCTACTCA TTTGCTGACA CATCAGTTAT 2580

ATCrXGTGGC ATATOCATTG TGGAAACTGG ATGAACAGGA TGTATAATAT GCAATCTTAC 2640

TTCTATATCA TTACGAAAAC ATCTTAGTTG ATGCTACAAA ACACCTTGTC AACCTCTTCX: 2700

TGTCTTACCA AACAGTGGGA GGGAATTCCT AGCTGTAAAT ATAAATTTTG CXXTTCCATT 2760

TCTACTGTAT AAACAAATTA GCAATCATTT TATATAAAGA AAATCAATGA AGGATFTCTT 2820

ATTTTCTTGG AATTTTGTAA AAAGAAATTG TGAAAAATGA GCTTGTAAAT ACTCXATTAT 2880

TTTATmAT AGTCTCAAAT CAAATACATA CAACCTATGT AATTTTTAAA GCAAATATAT 2940

AATGCAACAA TGTGTGTATG TTAATATCTG ATACTGTATC TGGGCTGATT TTTTAAATAA 3000
AATAGAGTCT GGAATGCT

Seq ID NO: 337 Protein sequence
Protein Accession #: NP_005786.1

1 11 21 31 41 51

| | | | | |

MEKKCTLYFL VLLPFFMILV TAELEBSPED SIQL6VTRNK IMTAQYECYQ KIMQDPIQQA 60

EGVYCNRTVn) GWLCWNDVAA GTESMQLCPD YFQDFDPSEK VTKIGDQDGN WFRBPASNRT 120

WTNYTOaiVN THEKVKTALN LPYLTIIGHG LSIASLLISL GIFFYFKSLS OQRITLHKNL 180

PPSFVCNSW TIIHIiTAVAN NQALVATNPV SCKVSQFIHL YLMGCNYPMM LCBGIYXiHTL 240

IWAVFAEKQ HLMWYYFLGW GPPLIPACIH AIARSLYYND NCWISSDTHIi LYIIHGPICA 300

ALLVNLFFLL NIVRVLITKL KVTHQAESNL YMKAVRATLI LVPLLGIEFV LIPWRPEGKI 360

AEEVYDYIMR ILMHFQGIiLV STIFCFPNGE VQAILRRMUN QYKIQFGN5F SNSEALRSAS 420
YTVSTISDGP GYSHDCPSEH I.NGKSIBDZE HVLLKFENLY

Seq ID NO: 338 DNA sequence
Nucleic Acid Accession #: NM_001795
Coding sequence: 25-2379

1 11 21 31 41 51

| | | | | |

GCACX3ATCTG TTCCTCCTGG GAAGATGCAG AGGCTCATGA TGCTCCTCGC CACATGGGGC 60

GCCTGCCTGG GCCTGCTGGC AGTGGCAGCA GTGGCAGCAG CAGGTGCTAA OCCTGCCCAA 120

GGGGACACCC ACAGCCTGCT GCCCACCCAC CX3GCGCCAAA AGAGAGATTG GATTTGGAAC 180

CAGATGCACA TTGATGAAGA GAAAAACACC TCACTTCX:CC ATCATGTAGG CAAGATCAAG 240

TCAAGCGTGA GTC6CAAGAA TGCCAAGTAC CTGCTCAAAG GAGAATATGT GGGCAAGGTC 300

TTC06GGTCX3 AT6CAGAGAC AGGAGACGTG TTOGCCATTG AGAGGCTGGA COGGGAGAAT 360

ATCTCAGA6T ACCACCTCAC TGCTGTCATT GTGGACAAGG ACACTGGTGA AAACCTGGAG 420

ACTCCTTCCA GCTTCACCAT CAAAGTTCAT GACGTGAACG ACAACTGGCC TGTGTTCACG 480

CATCGGTTGT TCAATGCGTC CGTGCCTGAG TCGTOGGCTG TGGGGACCTC AGTCATCTCT 540

GTGACACCAG 7GGATGCAGA CGACCCCACT GTGGGAGACC ACGCCTCTGT CATGTACCAA 600

ATCCTGAAGO 6SAAAGAGTA TTTTGCCATC GATAATTCTG GACGTATTAT CACAATAACX3 660

AAAAGCTTGG ACCGAGAGAA GCAGGCCAGG TAT6AGATCG TGGTGGAAGC GCGAGATGCC 720

CAGGGCCTCC GGGGGGACTC GGGCACGGCC ACCX3TGCTGG TCACTCTGCA AGACATCAAT 780

GACAACTTCC CCTTCTTCAC CCAGACCAAG TACACATTTG TCGTGCCTGA AGACACCCGT 840

GTGGGCACCT CTGTGGGCTC TCTGTTTGTT GAGGACCCAG ATGAGCCCCA GAACOGGATG 900

ACCAAGTACA GCATCTTGCG GGGC3GACTAC CAGGACGCTT TCACCATTGA GACAAACCCC 960

GCCCACAACG AGGGCATCAT CAAGCCCATG AAGCCTCTGG ATTATGAATA CATCCAGCAA 1020

TACAGCTTCA TOGTCGAGGC CACAGACCCC ACCATCGACC TOOGATACAT GAGCCCTCCC 1080

GCGGGAAACA GAGOCCAGGT CATTATCAAC ATCACAGATG TGGAOGAGCC CCCCATTTTC 1140

CAGCAGCCTT TCTACCACTT CCAGCTGAAG GAAAACCAGA AGAAGCCrCT GATTGGCACA 1200

GTGCTGGCCA TGGACCCTGA TGCGGCTAGG CATAGCATTG GATACTCCAT CCGCAGGACC 1260

AGTGACAAGG GCCAGTTCTT CCGAGTCACA AAAAAGGGGG ACATTTACAA TGAGAAAGAA 1320

CTGGACA6AG AAGTCTACCC CTGGTATAAC CTGACTGTG6 AGGCCAAAGA ACTGGATTCC 1380

ACTGGAACCC CCACAGGAAA AGAATCCATT GTGCAAGTOC ACATTGAAGT TTTGGATGAG 1440

AATGACAATG CCGOGGAGTT TGCCAAGCCC TACCAGCCCA AAGTGTGTGA GAACGCTGTC 1500

CATGGCCA6C TGGTCCTGCA GATCTCOGCA ATAGACAAGG ACATAACACC ACGAAACGTG 1560

AAGTTCAAAT TCACCTTGAA TACTGAGAAC AACTTTACCC TCAOGGATAA TCAOGATAAC 1620

ACGGCCAACA TCACAGTCAA GTATGGGCAG TTTGACOGGG AGCATACCAA GGTCCACTTC 1680

CTACCCGTGG TCATCTCAGA CAATGGGATG CCAAGTCGCA CJGGGCACXZAG CACGCTGACC 1740

GTGGCCGTGT GCAAGTGCAA CGAGCAGGGC GAGTTCACCT TCTGCGAGGA TATGGCCGCC 1800

CAGGTGGGCG TGAGCA7CCA GGGAGTGGTA GCCATCTTAC TCTGCATCCT CACCATCACA 1860

6TGATCACCC TGCTCATCTT CCTGGGGCGG CGGCTGOGGA AGCAGGCC06 OGGGCAOGGC 1920

AAGAGC6TGC CGGAGATCCA CGAGCAGCTG GTCACCTAOG ACGAGGAGGG CGGCGG0GA6 1980

ATGGACACCA CCAGCTAOGA TGTGTCGGTG CTCAACTOSG TGCGCCGCGG CGGGGCCAAG 2040

CXXXXXXX3GC CCGCGCTGGA CGCCCGGCCT TCCCTCTATG CGCAGGTGCA GAAGCCACCG 2100

AGGCACX30GC CTGGGGCACA CGGAGGGCCC GGGGAGATGG CAGCCATGAT CGAGGTGAAG 2160

AAGGAOGAGG CGGACCAOGA COGOGACGGC COCCXrCTAOG ACACGCTGCA CATCTAC6GC 2220

TACXjAGGGCT CCGAGTCCAT AGCOGAGTCC CTCAGCTCOC TGGGCACCGA CTCATCGGAC 2280

TCT GA OGTGG ATTAOGACTT CCTTAAOGAC TOGGGACCCA GGTTTAAGAT GCTGGCTGAG 2340

CTGTAOGGCT OGGACCCCGG GGAGGA6CTG CPGTATTAGG GGGCGGAGGT CACTCTGGGC 2400 

CTGGGGACCC AAACCCCCPG CAGCCCAOGC CACTCAGACT OCAGGCACCA CAGGCTOCAA 2460 

AAATOGCAGT GACTCCCCAG CCCAGC3^CCC CTIC C I' O GTO GGTCCCAGflG ACCTCATCAG 2520 

CCrPQC3GATA GCAAACTCCA GGTTCCTGAA ATATCCAGGA ATATATGTCA GTGATGACTA 2580 

TTCTCAAATG CTGGCAAATC CAGGCTGGTG TTCTGTCTGG GCTCAGACAT CX3^CATAACC 2640 

CTGTCACCOV CAGACOGCOG TCTAACTCAA AGACTTCCTC TGGCTCCCCSl AGGCTGCAAA 2700 

GCAAAACAGA CTGTGTrTAA CTGCTGCAGG tfl Xr iT r riLT AGGGTCCCTG AAC GCOC TGG 2760 

TAAGGCTGGT GAGGTCCTGG TGCCTATCTG CCTGGAGGCA AAGGOCTGGA OUSCTTGACT 2820 

TGTGGGGCAG GATTCTCTGC AGCCCATTCC CAAGGGAGAC TGACCATCAT GCXXTTCTCTC 2880 

GGGAGCCCTA GCCCTGCTCC AACTCCATAC TCCACTCCAA GTGCCCCACC AC TCCC CAAC 2940 

CCCTCTCCAG GCCTGTCAAG AGGGAGGAAG GGGCCCCATG GCAGCTCCTG " ACCTTGGGTC 3000 

CXGAAGItSAC CTCACTGGCC TGCCATGCCA GTAACTGTGC TGTACTGAGC ACTGAACCAC 3060 

ATTCAGGGAA ATGCTTATTA AACCTTGAAG CAACTGTGAA TTCATTCTGG AGGGGCAGTG 3120 

GAGATCAGGA GTGACAGATC ACAGGGT6AG GGCCACCTCC ACA CCCACC C OCTCrSG AGA 3180 

AGGCCTGGAA GAGCTGAGAC CTTGCTTTGA GACTCCTCAG CACOOCTGCA GTTTPGCCTG 3240 

AGAAGGGGCA GATGTTCCOG GAGATCAGAA GAOGTCTCCC CTTCTCTGCC TCAOCTGGTC 3300 

GCCAATCCA7 GCTCTCTTTC TTTTCTCTGT CTACTCCTTA TCCCTTGGTT TAGAGGAACC 3360 

CAAGATGTGG CCTTTAGCAA AACTGACAAT GTCCAAACCC ACTCATGACT GCATGACGGA 3420 

GCCGAGCATG TGTCTTTACA CCTCGCTGTT GTCACATCTC AGGGAACT6A CCCTCAGGCA 3480 

CACCTTGCAG AAGGAAG6CC CT G OOCTGCC CAACCTCTGT GGTCAOOCAT GCATCATTCC 3540

ACTOGAAOGT TTCACP6CAA ACACACCTTG GAGAAGTGGC ATCAGTCAAC AGAGAGGGGC 3600 

AGGGAAGGAG ACACCAAGCT CACCCTTCX?r CATGGACCGA GGTTCCCACT CTGGCAAAGC 3660 

CCCTCACACT GCAAGGGATT GTAGATAACA CTGACTTGTT TGTTTTAACC AATAACTAGC 3720 

TTCTTATAAT GATTTTTTTA CTAATGATAC TTACAAGTTT CTAGCTCTCA CAGACATATA 3780 

GAATAACGGT TTTTGCATAA TAAGCAGGTT GTTATTTAGG TTAACAATAT TAATTCAGGT 3840 

TTTTTAGTTG GAAAAACAAT TCCTGTAAOC TTCTATTTTC TATAATTGTA GTAATTGCTC 3900 

TACAGATAAT GTCIATATAT TGGCCAAACT GGTGCATGAC AAGTACTGTA TTTTTTTATA 3960 
CCTAAATAAA GAAAAATCTT TAGCCTGGGC AACAAAAAAA 

Seq ID NO: 339 Protein sequence
Protein Accession #: NP_001786 

1 11 21 31 41 51 

| | | | | |

MQRLMMIiLAT SGACLGLLAV AAVAAAGANP AQRDTHSLLP THRRQKRDMI HNQMHIDEEK 60 

NTSLPHHVGK IKSSVSRKNA KYLLKGEYVG KVFRVDAETG DVFAIERLDR ENISBYHLTA 120 

VIVDKDTCEM LETPSSFTIK VHDVKDIIHFV FTHRLFNASV PESSAVGTSV ISVTAVDADD 180 

PTVGOHASVM YQILXGKEYP AIDNTSGRIZT ITKSLDREKO ARYEIWEAS DAQGLRGDSG 240 

TATVLVTLQO INDNFPFFTQ TKYTPWPED TRVGTSVGSL PVEDPDEPQN RMTKVSILRG 300 

DYQDAFTIET NPAHNEGIIK PMKPLDYEYI QQYSFIVEAT DPTIDLRYMS PPA<2JRAQVI 360 

INITDVDEPP IFQQPFYHFQ LKENQKKPLI GTVLAMDPDA ARHSIGYSIR RTSDKGQFPR 420 

VTKKGDIYNE KEUJREVYPW YNLTVEAKEIi DSTGTPTGKE SIVQVHIBVL DENDMAPEFA 480 

KPYQPKVCEN AVHGQLVLQI SAIDKDITPR NVKFKPTLNT ENNFTLTDNH DNTANITVKY 540 

GQFDREHTKV HFLPWISDN GMPSRTGTST LTVAVOCCNE QGBFTFCEDM AAQVGVSIQA 600 

WAILLCIIiT ITVITIiIPL RRRIJIKQARA HGECSVPEIHB QLVTVDEEGG GEMDTTSYDV 660 

SVLNSVRRGG AECPPRPALDA RPSLYAQVQK PPRKAPGAHG GPGQfAAMIB VKKDEADflDG 720 

DGPPYDTLHI YGYSGSESIA ESLSSL6TDS SDSDVDYDFL NDWGPRFXNL AELYG5DPSC 780 
ELLY 



Seq ID NO: 340 DNA sequence
Nucleic Acid Accession #: NM_003088
Coding sequence: 112-1593

1 11 21 31 41 51 

| | | | | |

GCGGAGGGTG CGTGCGGGCC GCXX3CAGCCG AACAAAGGAG CAGGGGCX5CC GCCGCAGGGA 60 

CCCGCCACCC ACCTCCCG6G GCCGG6CAGC GGCCTCTCGT CTACTGCCAC CATGACCGCC 120 

AA0G6CACAG CCGAGGCGGT GCAGATCCAG TTCGGCCTCA TCAACTGCGG CAACAAGTAC 180 

CTGAC36GCCG AGGCGTTOGG GTTCAAGGTG AAOSOGTCOG CCAGCAGCCT GAAGAAGAAG 240 

CAGATCTGGA OGCTGGACCA GCCXXXTTGAC GAGGCGGGCA GOGOGGCOGT GTGCCTGCGC 300 

AGCCACCTGG GCCGCTACCT GGCGGOGGAC AAGGACGGCA ACGTGACCTG OGAGOGCGAG 360 

GTGCCX»GTC CXX3ACTGCXX3 TTTCCTCATC GTGGCGCACG KCChQOGTCG CTGGTCGCTG 420 

CAGTOCGAGG OGCACCGGCG CTACTTCGGC GGCACCGAGG ACCGCCTGTC CTGCTTCGCG 480 

CAGACGGTGT CCCCOGCGGA GAAGTGGAGC GTGCACATOG CCATGCACCC TCAGGTCAAC 540 

ATCTACAGTG TCAOOOGTAA GCGCTAOSOG CACCTGAGOG CGOGGOOGGC OGAOGAGATC 600 

GOCGTGGACC GOGACGTGCC CTGGQGOGTC GACTCGCTCA TCACCCTCGC CTT0CA6GAC 660 

CAGCGCTACA GCGTGCAGAC CGCCGACCyVC CXKmCCTGC GCCACX5ACGG GCGCCTGGT6 720 

GCGOGCCCCG AGCOGGCCAC TGGCTACACG CTGGAGTTCC GCTTCOGGCAA GGTGGCCTTC 780 

CGCGACTGCG AGGGCCGTTA CCTGGCGCCG TCGGGGCCCA GOGGCAOGCT CAAGGCGGGC 840 

AAGGCCACCA AGGTGGGCAA GGACGAGCTC TTTGCTCTGG AGC3«3AGCTG CGCCCAGGTC 900 

GTGCTGCAGG CGGCCAAGGA GAGGAACGTG TCCA060G0C AGGGTATGGA CCTGTCTGCC 960 

AATCAGGACG AGGAGACCXSA CCAGGAGACC TTCCAGCTGG AGATCGACCG OGACACCAAA 1020 

AAGTGTGCCT TCOGTACCCA CAOGGGCAAG TACTGGAOGC TGAOGGCCAC 0GGGGG06TG 1080 

CAGTCCACCG CCTCCAGCAA GAATGCCAGC TGCTACTTTG ACATOGAGTG GCGTGACCGG 1140 

CGCATCACAC TGAGGGCGTC CAATGGCAAG TTrGTGACCT CCAAGAAGAA TGGGCAGCTG 1200 

GCCGCCTCGG TGGAGACAGC AGGGGACTCA GAGCTCTTCX: TCATGAAGCT CATCAACCGC 1260 

CCCATCATCG TGTTCCGCX3G GGAGCATGGC TTCATOGGCT GCCGCAAGGT CAOGGGCACC 1320 

CTGGAOGCXav AC0GCTCX31G CTATGACX3TC TTCCAGCTGG AGTTCAAOGA TGGOGCCTAC 1380 

AACATCAAAG ACTCCACAGG CAAATACTGG AOSGTGGGCA GTGACTCOGC GGTCAOCAGC 1440

AGCGGCGACA CTCCTGTGGA CrTCTTCTTC GAGTTCTGOG ACTATAACAA GGTGGCCATC 1500

AAGGTGGGOG GGOGCTACCT GAAGGGOGAC CAOGCAGGOG TCCTGAAGGC CTOSGCGGAA 1560 

ACCGTGGACC CCGCCTOGCT CTGGGAGTAC TAGGGCCGGC CCGTCCTTCC COGOCOCTGC 1620 

CCACATGGCG GCTCCTGCCA ACCCTCCCTG CTAACCCCTT CTCCGCCAGG TGGGCTCCAG 1680 

6G06GGAGGC AAGCCCCCTT GCCTTTCAAA CIOQAAACOC CAGAGAAAAC GGT6C0C0CA 1740 

CCTGTCGCCC CTATGGACTC CCCACTCTCC CCTCCGOCOG GOTTCCCTAC TCCCCTCGGG 1800 
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TCAGCGGCTG OGGCCTGGCC CTGGGAGGOi TTTCAGATGC CCCTGCCCTC riVfCfGCCA I860 

OGGGGCGAGT CTGGCACCTC TTTCTTCTGA CCTCAGAOGG CTCTGAGCCT TATTTCTCTO 1920 

GAAGCGGCTA AGGGACGGTT GGGGGCTGGG AGCCCTCGGC GTGTftGTGTA ACTGGAATCT 1980 

TTTGCCTCTC CCAGCCACCT CaOXMXC CCCCAGGAGA GCTGGGCACA TGTCCC AAGC 2040 

CTGTCAGTGG CCCTOOCTGG TGCACT6TCC COGAAACXXX: TGCTTGGGAA GGGAAGCTOT 2100 

CQGGftGGGCT AGGACTGACC CTTGTGGTGT TTTTTTOGGT GGTGGCTGGA AACAGCCCCT 2160 

CTCCCACGTC GGAGAGGCTC AGCCPGGCTC CCTTCCCTGG AGOGGCAGGG OGTGAOGGCC 2220 

ACAGGGTCTG CCGGCTGCAC GTTCXGCCAA GGTGGTQGTG COGGGCtSGGT AGGGGTGTG6 2280 

GGGCOGTCTT CCTCCTGTCT CTTTCCTTTC ACCCTAGCCT GACTGGAAGC AGAAAAT6AC 2340 

CAAATCAGTA TTTTTTTTAA TGAAATATTA TTCCTGGAOG CGTCCCAGGC AAGCCTGGCT 2400 

GTAGTAGCGA GTGATCTGGC GGGGGGCGTC TCflCCACXXT CCCCAGGGGG TGCATCTCAG 2460 

CCCCCTCTTT CCGTCCTTCC CGTCCAOXC CAGCCCTGGG CCTGGC3CTGC OGACACCTCG 2520 

GCCAGAGCCC CTGCTCTGAT TGGTGCTCOC TQQGCCTCCC GGGTGGATGA AGCCAQGCGT 2580

CGCCCCCTCC GGGAGOCCTG GGGTGAGCCG COGGGGCCCC CCTGCTGCCA GOCTC OCCCG 2640 

TCCCCAACAT GCATCTCACT CTGG6TGTCT TGGTCTPTTA TTTTTTGTAA (5TGTCATTTG 2700 

TATAACTCTA AAOGCCCATG ATAGTAGCTT CAAACTGGAA ATAGCGAAAT AAAATAACTC 2760 
AGTCTGC 

Seq ID NO: 341 Protein sequence 
Protein Accession #: NP_003079

1 11 21 31 41 51 

ilTANGTAEAV QICFGIjINCG NKYLTAEAFG FKVNASASSL KKKQIVfTIiBQ PPDEAGSAAV 60 

CLRSHLGRYL AADKDGNVTC EREVPGPDCR PLIVAHDUGR WSLQSEAHRR YFGGTBDShS 120 

CFAQTVSPAE KWSVHIAMHP OVNIYSVTRK RYAHIiSARPA DBIAVDRDVP f4GVDSLZTLA 180 

PQDQRYSVQT ADHRPLRHDG RLVARPEPAT GYTLEPRSGK VAPRDCBGRY LAPSGPSGTL 240 

KAGKATKVGK DELFALEQSC AQWLQAANE RNVSTRQGMD LSANQDEETD QSTFQLEIDR 300 

DTKKCAFRTH TGKYWTLTAT GGVQSTASSK NASCYFDIEW RDRRITLRAS NGKFVTSKKN 360 

GQLAASVETA GDSELPLMKL INRPIIVFRG EHGPIGCRKV TGTLDANRSS YDVFQLEPND 420 

GAYHIKDSTG KYWTVGSDSA VTSSGDTPVD FFFEFCDYNK VAIKVGORYL KGDHAGVLKA 480 
SAETVDPASL WEY 



Seq ID NO: 342 DNA sequence 

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 660..1705

1 11 21 31 41 51 

OGCTCCGCAC ACATTTCCTG TCGCGGCCTA AGGGAAACTG TTGGCCGCTG GGCCCGCGGG 60 

GGGATTCTTG GCAGTTGGGG GGTCOGTCGG GAGCGAGGGC GGAGGGGAAG GGAGGGGGAA 120 

CCX3GGTTGGG GAAGCCAGCT GTAGAGGGCX3 GTGACCGCGC TCCAGACACA GCTCTGCGTC 180 

CTCGAGCGG6 ACAGATCCAA GTTGGGAGCA GCTCTGCGTG CGGGGCCTCA GAGAATGAGG 240 

CCGGCGTTC6 CCCTGTGCCT CCTCTGGCAG GCGCTCTGGC CCGGGCCGGG CGGCGGCGAA 300 

CACCCCACTG CCGACCGTGC TGGCTGCTCG GCCTCGGGGG CCTGCTACAG CCTGCACCAC 360 

GCTACCATGA AGCGGCAGGC GGCCGAGGAG GCCTGCATCC TGCGAGGTGG GGOGCTCAGC 420 

ACCGTGCGTG CGGGCGCCGA GCTGCGCGCT GTGCTCGOtSC TCCTGCGGGC AGGCCCAGGG 480 

CCCGGAGGGG GCTCCAAAGA CCTGCTGTTC TGGGTCGCAC TGGAGCGCAG GCGTTCCCAC 540 

TCCACCCTGG AGAACGAGCC TTTGCGGGGT TTCTCCTGGC TGTCCTCOGA CCCCGGCGGT 600 

CTCGAAAGCG ACACGCTGCA GTGGGTGGAG GAGCCCCAAC GCTCCTGCAC CGCGCGGAGA 660 

TCCXKX3GTAC TOCAGGCCAC CGGTQQGGTC GAGCOOGCAG CTGGAAGGAG ATGCGATGCC 720 

ACCTGCGCGC CAAOG6CTAC CT6TGCAAGT ACCAGTTTGA GGTCTTGTGT CCTGCX3CGGC 780 

GCCCCGGGGC CGCCTCTAAC TTGAGCTATC GCGCGCCCTT CCAGCTGCAC AGCGCOSCTC 840 

TGGACTTCAG TCCACCTGGG ACCGAGGTGA GTGCGCTCTG CCGGGGACAG CTCCCGATCT 900 

CAGTTACTTG CATCGOGGAC GAAATCGGCG CTCGCTGGGA CAAACTCTCG GGCGATGTGT 960 

TCT6TCCCTG COCCXXMAGG TACCTCCGTG CTGGCAAATG CGCAGAGCTC CCTAACTGCC 1020 

TASACGACTT GGGAGGCTTT GCCTGCGAAT GTGCTACGGG CTTCGAGCTG GGGAAGGACG 1080 

GCCGCTCTTG TGTGACCAGT GGGGAAGGAC AGCCGACCCT TGGGGGGACX: GGGGTGCCCA 1140 

CCAGGCGCCC GCCGGCCACT GCAACCAGCC COGTGCCGCA GAGAACATG6 OCAATCAGGG 1200 

TCGACGAGAA GCTGGGAGAG ACACCACTTG TGCCTGAACA AGACAATTCA GTAAC ATCTA 1260 

TTCCTGAGAT TCCTCGATGG GGATCACAGA GCACGATGTC TACCCTTCAA ATGTCCCTTC 1320 

AAGCCGAGTC T^GGCCACT ATCACCCCAT CAGGGAGCGT GATTTCCAAG TTTAATTCTA 1380 

CGACTTCCTC TGCCACTCCT CAGGCTTTOG ACTCCTCCTC TGCX3GTGGTC TTCATATTTG 1440 

TGA6CACAGC AGTAGTAGTG TT6GTGATCT YGACCAIGAC AOTACTGGGG CTTGTCAA,GC 1500 

TCTGCTTTCA CGAAAGCCCC TCTTCCCAGC CAAGGAAGGA GTCTATGGGC CCGCCGGGCC 1560

TGGAGAGTGA TCCTGAGCCC GCTGCTTTGG GCTCCAGTTC TGCACATTGC ACAAACAATG 1620 

GGGTGAAAGT CGGGGACTGT GATCTGOGGG ACAGAGCAGA GGGTGCCnG CTGGOQGAGT 1680 
CCCCTCTTGG CTCTAGTGAT GCATAG 

Seq ID NO: 343 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MGKDFMTKTP KAFATKAKID KWDLIKLKSF CTAKETIIRV NSQPTDWQKT PAIYPSDKGV 60 

lARIYKELEQ lYKKKKPTKT LRTKFLSRPK GNCWPLGPRG DSWQLGGPSG ARAEGKGGGT 120 

6LCHCPAVBQG DRAPOTALRP RAGQIQVGSS SAOGASENEA GVRPVPPLAG ALARAGRRRT 180 

PHCRPCWLLG LGGLLQPAPR YHEAAGGRGG LHPARWGAQH RAOGRRAARC ARAPAGRPRA 240 

RRGLQRPAVL GRTGAQAPPti HPGERAPAGF LLAVLRPRRS RKRHAAVGGG APTLLHRAEM 300 

RGTPGHRWGR ARSWKEMRCH LRANGYLCKY QFBVLCPAPR PGAASOT.SYR APFQLHSAAL 360 

DFSPPGTEVS ALCRGQLPIS VTCIADEIGA RWDKLSGDVL CPCPGRYLRA GKCAEltPNCL 420 

DDLGGFACEC ATGPELGKDG RSCVTSGEGQ PTLGGTGVPT RRPPATATSP VPQRTWPIRV 480 

DEKI/5ETPLV PEQDNSVTSI PEIPRWGSQS TMSTLQMSLQ AESKATITPS GSVISKFNST 540 

TSSATPQAFD SSSAWFIPV STAWVLVIL TMTVLGI.VKL CFHESPSSQP RKESMGPPGL 600 
ESDPEPAAXXS SSSABCTHNG VKVGDCDXJtD RABGALLAES PU3SSDA

Seq ID NO: 344 DNA sequence
Nucleic Acid Accession #: NM_012072
Coding sequence: 149-2107

1 11 21 31 41 51

IaAGCCCTCA GCCrrrGTGT CCTTCTCTGC QCCQGRGnOG CTGCAGCCCA COCCTCAGCT 60
CCCCTTCGGG aXaGCTGGG AGCCGAGATA GAAGCTCCTO TOGCCGCTGG GCTTCTOGCC 120 
TCCCGCAGAG GGCCACACAG AGACOGGGAT GGCCACCTCC ATGGGCCTGC TGCTGCTGCT 180 
GCTGCroCTC CTGACCCAGC CCGGGGOGGG GAC3GQGAGCT GACACGGAGG CGGTGGTCTG 240 
CGTGGGGACC GCCXGCTACA OGGCCCACTC GGGCAAGCTG AGCGCTCCCG AGGCCXAGAA 300 
CCACTCCAAC CAGAAOCGGG GCAACCTGGC CACTGTGAAG AGCAAGGAGG AGGCCCAGCA 360 
CGTCCWCGA CTACTQGCCC AGCTCCTCAG GCGGGAGGCA GCCCTGAOGG CGAGGATG^ 420 
CAAGTTCTCG ATTGGGCTCC AGOGAGAGAA GGGCAAGTGC CTGGACCCTA GTCTGCOGCT 480 
S^S^ AGCtSgtS GCGGGGGGGA GGACACGCCT TACTCTAACT GGCftCAAGGA 540
GCrCOQCSAAC TCGTGCATCT CCAAGOGCTG TGTGTCTCTG CTGCTGGACC TGTCCCAGCC 600 
^SSttSc AACOGCCTGC CCAAGTGGTC TGAGGGCCCC TGTGGGAGCC CAGGCTCOCC 660 
O^StSc ATTGAGGGCT TCGTGTGCAA GTTCACCTTC AAAGGCATGT GCOQGCCTCT 720 
^ScCTGGGG GGCCCAGGTC AGGTGACCTA CACCACCCCC TTCCAGACCA CCAGTTCCTC 780 
CTtSa^ GTCCCCTTTC CCTCIGCGGC CAATCTA<SCC TGTGGGGAAG GTGACAAGGA 840 
CGAGACTCAG AGTCATTATT TCCTCTGCAA GGAGAAGGCC CCCGATGTGT TCGACTC^ 900 
CAGCrCGGGC CCCCTCTCTC TCAGCCCXaUV GTATGGCTGC AACTTCAACA ATGGG^CTG 960 
CCACCAGGAC TGCTTTGAAG GGGGGGATGG CTCCTTCCTC TGCGGCTGCC GACCAGGATT 1020 
CCGGCTGCT6 GATCACCTGG TGACCTGTGC CTCTCGAAAC CCTTGCAGCT CCAGCCCMG 1080 
TCGTGGGGGG GCCACGTGCG TCCTGGGACC CCATGGGAAA AACTACACGT GCCGCTGCCC 1140 
CCAAGGGTAC CAGCTG6ACT CGAGTCAGCT GGACTGTGTG GACGTGGATG AATGCCAGGA 1200 
CTCCCCCTGT GCCCAGGACT GTGTCAACAC CCCTGGGGGC TTCGGCTGCG AATGCTGGCT 1260 
TCGCTATGAG C<X5GG0GGTC CTGGAGAGGG GGCCTGTCAG GAT^^ 1320
GGGTCGCTOS CCTTCCGCCC AGGGCTGCAC CAACACAGAT GGCTCATTTC ACTGCTCCTG 1380 
IGAGGAGGGC TACGTCCTGG CCGGGGAGGA CGGGACTCAG TGCCAGGACG TGGATGAGTG 1440 
TCTGG6C0CG GGGGGCCCCC TCTGCGACAG CTTGTGCrTC AACACACAAG GGTCCTTCCA 1500
CTOTGGCTGC CTGCCAGGCT GGGTGCTGGC CCCAAATCGG GTCTCTTGCA CCATGGGG^ 1560 
TGTGTCTCTG GGACCACXAT CT^CCCC CGATGAGGAO GA^^ 1620 
GAGCACCGTC CCCCGCGCTG CAACAGCCAG TCCCACAAQG GGCCCOGAGG GCAO^CAA 1680 
GGCTACACCC ACCACAAGTA GACCTTCGCT GTCATCTGAC GCtXXXSVTCA CATCTGCCCC 1740 
ACTCAAGATG CTGGCCCCCA GTGGGTCCTC AGGCGTCTGG AGGGAGCCCA GCATCCATCA 1800 
CGCCACAGCT GCCrCTGGCC CCCAGGAGCC TGCAGGTGGG GACTCCTCCG TGGCCACACA 1860 
AAACAAOGAT GGCACTGACG GGCAAAAGCT GCTTTTATTC TACATCCTAG GCACCGTGGT 1920 
GgSSStA CTCCTGCTGG CCCTCGC^ 1980
GAAGAGGGAG GAGAAGAAGG AGAAGAAGCC CCAGAATGCG GCAGACAGTT ACTCCTGGGT 2040 
TCCAGAGCGA GCTGAGAGCA GGGCCATGGA GAACCAGTAC AGTCCGACAC CTGGGACAGA 2100 
CTGCTGAAAG TGAQGTGGCC CTAGAGACAC TAGAGTCACC AGCCACCATC CTCAGAGCTT 2160 
TGAACTOCCC ATTCCAAAGG GGCACCCACA TTTTTTTGAA AGACTCGACT GGAATCTTAG 2220 
CAAACAATTG TAAGTCTCCT CCTTAAAGGC CCCTTGGAAC ATGCAGGTAT TTTCTACGGG 2280 
TGTTTGATGT TCCTCAAGTG GAAGCTGTGT GTTGGCGTGC CAGGGTGGGG ATTTCGTGAC 2340 
TCTATAATGA TTGTTACTCC CCCTCCCTTT TCAAATTOCA ATGTGACCAA TTCCGGATCA 2400 
GGGTGTGAGG AGGCTGGGGC TAAGGGGCTC CCCTGAATAT CTTCTCTGCT CACTTCCACC 2460 
ATCTAAGAGG AAAAGGTGAG TTGCTCATGC TGATTAGGAT TGAAATGATT TGTTTCTCTT 2520 
CCTAGGATGA AAACTAAATC AATTAATTAT TCAATTAGGT AAGAAGATCT GGTTTTrTGG 2580 
TCAAAGGGAA CATGTTCGGA CTGGAAACAT TTCTTTACAT TTGCATTCCT CCATTTCGCC 2640 
AGCACAAGTC TTGCTAAATG TGATACTGTT GACATCCTCC AGAATGGCCA GAAGTGCAAT 2700 
TAACCTCTTA GGTGGCAAGG AGGCAGGAAG TGCCTCTTTA GTTCTTACAT TTCTAATAGC 2760 
CTTGGGTTTA TTTGCAAAGG AAGCTTGAAA AATATGAGAA AAGTTGCrTG AAGTGCATTA 2820 
CAGGTGTTTG TGAAGTCACA TAATCTACGG GGCTAGGGOS AfiAGAGGCCA GGGATTTGTT 2880 
CACAGATACT TGAATTAATT CATCCAAATG TACTGAGGTT AOC ACAC ACT TGACTAOGGA 2940 
TGTGATCAAC ACTAACAAGG AAACAAATTC AAGGACAACC TGTCTTTGAG CCAGGGCAGG 3000 
CCTCAGACAC CCTGCCTGTG GCCCC3GCCTC CACTTCATCC TGCCCGGAAT GCCAGTGCTC 3060 
CGAGCTCAGA CAGAGGAAGC CCTGCAGAAA GTTCCATCAG GCTGTTTCCT AAAGGATGTG 3120 
TGAACGQGAG ATGATGCACT GTGTTTTGAA AGTTGTCRTT TTAAAGCATT TTAGCACAGT 3180 
TCATAGTCCA CAGTTGATGC AGCATCCTGA GATTTTAAAT CCTGAAGTGT GGGTGGOSCA 3240 
CACACCAAGT AGGGAGCTAG TCAGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTGTCTCTT 3300 
TTCCTTAAAA TTGGGGGTAA GGAGGGAAGG AAGAGGGAAA GAGATGACTA ACTAAAATCA 3360 
TTTTTACAGC AAAAACTGCT CAAAGCCATT TAAATTATAT CCTCATTTTA AAAGTTACAT 3420 
TTGCAAATAT TTCTCCCTAT GATAATGCAG TCGATAGTGT GCACTCTTTC TCTCTCTCTC 3480 
TCTCTCTCAC ACACACACAC ACACACACAC ACACACACAC AGAGACACGG CACCATTCTG 3540 
CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACOCSATGGT CAGAGTCACT AGAAGTTACC 3600 
TGAGTATCTC TGGGAGGCCT CATGTCTCCT GTGGGCTTTT TACCAOGACT GTCCAGGAGA 3660 
ACAGACAGAG GAAATGTGTC TCCCTCCAAG GCCCCAAAGC CTCAGRGAAA GGGTCTTTCT 3720 
GCTTTTGCCT TAGCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGCCA 3780 
CAAGGTGCAG GGTTAATACT CTTGCCAGTT TTGAAATATA GATGCTATGG TTCAGATTGT 3840 
TTTTAATAGA AAACTAAAGO GGCAGGGGAA GTGAAAGGAA AGATGGAGGT TTTGTGCGGC 3900 
TCGATGGGOC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCAGT TTTCAGTTGG 3960 
AACTCTGGTG TTTAACACTT AAGGGAGACA AAGGCT6TGT CCATTTGGCA AAACTTCCTT 4020 
GGCCACGAGA CTCTAGGTGA TGTGTGAAGC TGGGCAGTCT GTGGTGTGGA GAGCAGOCAT 4080 
CTGTCTGGCC ATTCAGAGGA TTCTAAAGAC ATGGCTGGAT GCGCTGCTGA CCAACATCAG 4140 
CACTTAAATA AATGCAAATG CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTGCOCTTA 4200 
TCATTTGGQG TGAftGGAGAC ATTTCTGTCC TTGGCTTCCC ACAGCCCCAA CGCAGTCTGT 4260 
GTATGATTCC TGGGATCCAA CGAGCCCTCC TATTTTCACA CTGTTCTGAT TGCTCTCACA 4320 
GCCCAGGCCC ATCGTCTGTT CTCTCAATGC AGCCCTGTTC TCAACAACAG GGAGGTCATG 4380 
GAACCCCTCT GTGGAACCCA CAAGGGGAGA AATGGGTGAT AAAGAATCCA GTTCCTCAAA 4440 
ACCTTCCCTG GCAGGCTGGG TCCCTCTCCT GCTGGGTG6T GCTTTCTCTT GCACACCACT 4500
CCCACCACGG GGGGAGAGCC AGCAACCCAA CCAGACAGCT CACGTTGTGC ATCTGATGGA 4560 
AACCACTGGG CTCAAACACG TGCTTTATTC TCCTGTTTAT TTTTGCTGTT ACTTTGAAGC 4620 
ATCGAAATTC TTGTTTGGGG GATCTTGGGG CTACAGTAGT GGGTAAACAA ATGCCCACOG 4680 
CCCAACAGGC CATTAACAAA TOGTCCTTCT CCTGAGGGGC CCC3U3CTTGC TCGGGCGTGG 4740 
CACAGTCGGG AATCCAAGGG TCACAGTATG GGGAGAGGTG CACCCTGCCA CCTCCIAACT 4800

TCTOGCTftGA CACAGTGTTT CTGCCCAGGT GAOCTGTTCA GCAGCftSAAC AAGCCAGGGC 4860

CMGGGGAOG GGGGAAGTTT TCACTTOGAG ASGGACACCA AGACAATGAA GATTTGTTGT 4920 

CCAA&TAGGT CAATAATTCT GGGAGACICT TGGAAAAAAC TGAATATATT CAGGACXAAC 4980 

TCTCTCCCTC CCCTCATCCC ACATCTCAAA GCAGACAATG TAAAGAGftGA ACATCTCACA 5040 

CAOCCAGCTC GCCATGCCTA CTCATTCCTG AATTTCAGGT GCCATCACTG CTCTTTCTTT 5100 

CTTCTTTGTC ATTTGAGAAA GGATGCAGGA GGACAATTCC CACAGATAAT CTGAGGAATG 5160 

CAGAAAAACC AGGGCAGGAC AGTTATCGAC AATGCATTAG AACTTGGTGA GCATCCTCTG 5220 

TAGAGGGACT CCACCCCT6C TCAACACCTT GGCTTCCAQG CAAGACCAAC CACATCTGOT 5280 

CTCTGCCTTC GGTGGCCCAC ACACCTAAGC GTCATCGTCA TTGCCATAGC ATCATGATGC 5340 

AACACATCTA CGTGTAGCAC TAOGACGTTA TGTTTGGGTA ATGTGGGGAT GAACTGCATG 5400 

ACGCTCTGAT TAAGGATGTG GGGAAGTGGG CTGOGGTCAC TGTCGGCCTT GCAAGGCCAC 5460 

CTGGAGGCCT GTCTGTTAGC CAGTGGTGGA GGAGCAAGGC TTCAGGAAGG GCCAGCCACA 5520 

TGCCATCTTC CCTGOGATCA GGCAAAAAAG TGGAATTAAA AAGTCAAACC TTTATATGCA 5580 

TGTGTTATGT CCATTTTGCA GGATGAACTG AGTTTAAAAG AATTTTTTTT TCTCTTCAAG 5640 

TTCCTTTGTC TTTTCCATCC TCATCACAAG CCCTTGTTTG AGrGTCTTAT CCCTGAGCAA 5700 

TCTTTGGATG GATGGACATG ATCATTAGGT ACTTTTGTTT CAAOCTTTAT TCCTGTAAAT 5760

ATTTCTGTGA AAACTAGGAG AACAGAGATG AGATTTGACA AAAAAAAATT GAATTAAAAA 5820 

TAACACAGTC TTTTTAAAAC TAACATAGGA AAGCCTTTCC TATTATTTCT CTTCTTAGCT 5880 

TCTCCATTCT CTAAATCAGG AAAACAGGAA AACACAGCTT TCTAGCAGCT GCAAAATGGT 5940 

TTAATGCCCC CTACATATTT CCATCACCTT GAACAATAGC TTTAGCTTGG GAATCTGAGA 6000 

TATGATCCCA GAAAACATCT GTCTCTACTT OGGCTGCAAA ACCCATGGTT TAAATCTATA 6060 

TGGTTTGTGC ATTTTCTCAA CTAAAAATAG AGATGATAAT COGAATTCTC CATATA TTCA 6120 

CTAATCAAAG ACACTATTTT CATACTAGAT TOCTGAGACA AATACTCACT GAAGGGCTTG 6180 

TTTAAAAATA AATTGTGTTT TGGTCTGTTC TTCTAGATAA TGCCCTTCTA TTTTAGGTAG 6240 

AAGCTCTGGA ATCXTCTTTAT TGTGCTGTTG CTCTTATCTG CAAGGTGGCA AGCA U'n'OTi' 6300 

TTCAGCAGAT TTTGOCCACT ATTCCTCTGA GCIGAAGTTC TTTGCATAGA TTTGGCTTAA 6360 

GCTTGAATTA GATCCCTGCA AAGGCTTGCT CTGTGATGTC AGATGTAATT GTAAATGTCA 6420 

GTAATCACTT CATGAATGCT AAATGAGAAT GTAAGTATTT TTAAATGTGT GTAT TTCAA A 6480 

TTTGTTTGAC TAATTCTGGA ATTACAAGAT TTCTATGCAG GATTTACCTT CATCCTGTGC 6540 

ATGTTTCCCA AACTGTGAGG AGGGAAGGCT CAGAGATCGA GCTT CTCCTC TGAG TTCTAA 6600 

CAAAATGGTG CTTTGAGGGT CAGCCTTTAG GAAGGTGCAG CTTTGTTGTC CTTTGAGCTP 6660 
TCTGTTATGT GCCTATCCTA ATAAACTCTT AAACACATT 

Seq ID NO: 345 Protein sequence
Protein Accession #: NP_036204

1 11 21 31 41 51

| | | | | |

HATSMGLLLL LLLLLTQPGA GT6ADTEAW CVGTACYTAH SGKLSAAEAQ NHCNQNGGNL 60 

ATVKSKEEAQ HVQRVLAQLL RREAALTARM SKFMIGLQRE KGKCLDPSLP LKGPSWVGGG 120 

EOTPYSNMHK ELRNSCISKR CVSLLLDLSQ PLLPNRLPKM SEGPCX^PGS PGSNIBGFVC 180 

KFSFKGMCRP LALGGPGQVT YTTPFQTTSS SLEAVPPASA ANVACGEGDK DETQSHYFLC 240 

KEKAPDVFDW GSSGPLCVSP KYGCNFNNGG CHQDCFEGGD GSPLCGCRPG PRLLDDLVTC 300 

ASRNPCSSSP CRGGATCVIjG PHGKNYTCRC PQGYQLDSSQ LDCVDVDECQ DSPCAOECVN 360 

TPGGFRCECM VGYEPGGPGE GACQDVDBCA LGRSPCAQGC TNTDGSFHCS CEEGYVLAGE 420 

DGTWQDVDB CVGPGGPLCD SLCFNTQGSP HCGCLPGWVL APNGVSCTMG PVSLGPPSGP 480 

PDEEDK6EKE GSTVPRAATA SPTRGPE6TP KATPTTSRPS LSSDAPITSA PLKHLAPSQS 540 

SGVHRBPSIH RATAASGPQE PAGGDSSVAT QNMSGTDGQK LLIiFyZl.GTV VAILLLLALA 600 
LGLLVYRKRR AKREEKKEKK PQNAADSYSSI VPE8AESRAM ENQYSPTPGT DC 

Seq ID NO: 346 DNA sequence
Nucleic Acid Accession #: Z31560
Coding sequence: <1-966

1 11 21 31 41 51

| | | | | |

CACAGCGCCC GCATGTACAA CATGATGGAG ACGGAGCTGA AGCCGCGGGG CCCGCAGCAA 60 

ACTTCGGGGG GCGGCGGCGG CAACTCCACC GCGGCGGCGG CCGGCGGCAA CCAGAAAAAC 120 

AGCCCGGACC GCGTCAAGCG GCCCATGAAT GCCTTCATG6 TGTGGTCOCG OGGGCAGCGG 180 

OGCAAGATGG CCCAGGAGAA CCCCAAGATG CACAACTCG6 AGATCAGCAA GCGGCTGGGC 240 

GCCGAGTGGA AACTTTTGTC GGAGAGGGAG AAGOGGCCGT TCATCGACGA GGCTAAGCGG 300 

CTGCGAGCGC TGCACATGAA GGAGCACCCG GATTATAAAT ACCGGCCCCG GCGGAAAACC 360 

AAGACGCTCA TGAAGAAGGA TAAGTACACG CTGCCCGGOG GGCTGCTGGC CCCCGGCGGC 420 

AATAGCATGG CGAGCGGGGT CGGGGTGGGC GCCGGCCTGG GCGCGGGCGT GAACCAGCGC 480 

ATGGACAGTT ACGCOCACAT GAACQGCTGG AGCAAOGGCA GCTACAGCAT GATGCAGGAC 540 

CAGCTQGGCT ACCOGCAGCA CCCGGGCCTC AATGCGCAOG GCGCAGCGCA GATGCAGCCC 600 

ATGCACCGCT ACGACGTGAG CGCCCTGCA6 TACAACTCCA TGACCAGCTC GCAGACCTAC 660 

ATGAACGGCT CGCCCACXTTA CAGCATGTGC TACTCGCAGC AGGGCACCCC 7GGCATGGCT 720 

CTTGGCTCCA TGGGTTCGGT GGTCAAGTCC GAGGCCAGCT OCAGCCOCCC TGTGGTTACC 780 

TCTTCCTCCC ACTCCAGGGC GCCCTGCCAG GCCGGGGACC TCCGGGACAT GATCAGCATG 840 

TATCTCCCCG GCGCCGAGGT GCCGGAACCC GCCGCCCCCA GCAGACTTCA CATGTCCCAG 900 

CACTACCAGA GOGGCCCGGT GCCCGGCACG GCCATTAACG GCACACTGCC CCTCTCACAC 960 

ATGTGAGGGC CGGACAGOGA ACTGGAGGGG GGA6AAATTT TCAAAGAAAA AOGAGGGAAA 1020 

TGGGAGGGGT GCAAAAGAGG AGAGTAAGAA ACAGCATGGA GAAAACOOGG TAOGCTCAAA 1080 



Seq ID NO: 347 Protein sequence
Protein Accession #: CAA83435

1 11 21 31 41 51

| | | | | |

HSARMmMME TELKPPGPQQ TSGGGGGNST AAAAGGNQKN SPDRVKRPMN AFMVWSRGQR 60 
RKMAQSNPKM HNSEISKRLG AEWKLLSETE KRPPIDEAKR LfiALKHKEHP DYKYRPRRKT 120 
KTLMKKDKrr IiPGGLLAPGG NSMASGVGVG AGLGAGVNQR MDSYAHKNG*} SNGSYSHMQD 180 
QLGYPQHPGL NAHGAAQMQP MHRYDVSALQ YNSMTSSQTY MNGSPTYSMS YSQQGTPGMA 240 
L6SMGSWXS EASSSPPWT SSSHSRA90Q AGOLRDMISM YLPGAEVPEP AAPSRLHMSQ 300 
HYQSGPVPGT AINGTLPLSH M 




Seq ID NO: 348 DNA sequence
Nucleotide Accession #: NM_002638
Coding sequence: 120-473

1 11 21 31 41 51

| | | | | |

CAATACAGCT AACGAATTAT OCCTTGTAAA TACCACAGAC CTGCCCrGGA GCCftGGCCAA 60 
GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 120 
TGACGGCCAG CAGCTTCTTX3 ATOGTGGTSS TGTTOCTCAT CGCTGGGAOG CTCGTTCrMS 180 
AGGCAGCTGT CACGGGAGTT CCTGTTAAAG GTCAAGftCAC TGTCAAAGGC C S TGTTOCAT 240 
TCAATGGACA AGATCCCGTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 300 
CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 360 
TCCGGTCOGC CATGTTGAAT CCCCCTAACC GCTG CTTGAA AGATACTGAC TGCCCAGGAA 420 
TCAAGAAGTG CTGTGAAGGC TCTTGajGGA TGGCCTGTTT OGTTCCCCAG TGAAGGGAGC 480 
axrrCCTTGC TGCACCTGTG COGTOCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCOC 540 
TGCTGCCCTT OCCCTTCCCA CACTGTCCAT TCTTCCTCOC ATTCAGGATG CCCAGQGCIG 600 
GAGCTGCXrrC TCTCATCCAC TTTCCAATAA A 

Seq ID NO: 349 Protein sequence
Protein Accession #: NP_002629

1 11 21 31 41 51

| | | | | |

MRASSFLIW VFLIAGTT.VL EAAVTGVPVK GQDTVKGRVP FNGQDPVKGQ VSVKGC^KVK 60 
AQEFVKGPVS TKFGSCPIIL IRCAMLNPPN RCLKDIDCPG IKKCCEGS06 MACFVPQ 



Seq ID NO: 350 DNA sequence
Nucleic Acid Accession #: NM_007183
Coding sequence: 75-2468

1 11 21 31 41 51

| | | | | |

GAATTCCGGA CAGGACGTGA AGATAGTTGG GTTTGGAGGC GGCCGCCAGG CCCAGGCCC3G 60 

GTGGACCTGC CGCCATGCAG GACGGTAACT TCCTGCTGTC GGCCCTGCAG CCTGAGGCCX3 120 

GCGTGTGCTC CCTGGCGCTG CCCTCTGACC TGCAGCTGGA GOGOGGGGGC GCOGAGGGGC 180 

CGGAGGCCGA GCGGCTGCGG GCAGCCOGOG TCCAGGAGCA GGTCOGOGGC OGCCTCTTGC 240 

AGCTGGGACA GCAGCCGCGG CACAACGGGG CCGCTGAGCC CGAGCCTGAG GCXGAGACTG 300 

CCAGAGGCAC ATCCAGGGGG CAGTACCACA CCCTGCAGGC TGGCTTCAGC TCTCGCTCTC 360 

AGGGCCTGAG TGGGGACAAG ACCTOGGGCT TCCGGCCCAT a3CX»AGC0G GCCTACAGCC 420 

CAGCCTCCTG GTCCTCCCGC TCCGOCGTGG ATCTGAGCTG CAGTOGGAGG CTGAGTTCAG 480 

CCCACAATGG GGGCAGCGCC TTTGGGGCCG CTGGGTACGG GGGTGCOCAG CCCACCCCTC 540 

CCATGCCCAC CAGGCCCGTG TCCTTCCATG AGCGCGGTGG GGTTGGGAGC OGGGCCGACT 600 

ATGACACACT CTCCCTGCGC TCGCTGCGGC TGGGGCCOGG GGGCCTGGAC GACCGCTACA 660 

GCCTGGTGTC TGAGCAGCTG GAGCCXX30GG CCACCTCCAC CTACAGGGCC TTTGCGTAOG 720 

AGCGCCAG6C CAGCTCCAGC TCCAGCCGGG CAGGGGGGCT GGACTGGCCC GAGGCCACTG 780 

AGGTTTCCCC GAGCOGGACC ATCCXTTOCCC CTGCCGTGCG CACCCTGCAG CGATTCCAGA 840 

GCAGCCACCG GAGCCGGGGG GTAGGGGGGG CAGTGCGGGG GGCCGTOCTG GAGCCA6TGG 900 

CTCGAGCGCC ATCTGT6GGC AGOCTCAGCC TCAGCCTGGC TGACTCGGGC CACCTGCOGG 960 

ACGTGCATGG GTTCAACAGC TACGGTAGCC ACOGAACCXT GCAGA6ACTC AGCAGOGGTT 1020 

TTGATGACAT TGACCTGCCC TCAGCAGTCA AGTACCTCAT GGCTTCAGAC CCCAACCTGC 1080 

AGGTGCTGGG AGOSGCCTAC ATCX:aGCACA AGTGCTACAG CX3ATGCAGCC GCCAAGAAGC 1140 

AGGCCCGCAG CCTTCAGGCC GTGCCTAGGC TGGTGAAGCT CTTCAACCAC GCCAACCAGG 1200 

AAGTGCAGOG GCATGCCACA GGT6CCAT6C GCAACCTCAT CTACGACAAC GCTGACAACA 1260 

AGCTGGCCCT G6TGGAGGAG AAOGGGATCT TCGAGCTGCT GOSGACACTG GGGGAGCAGO 1320 

ATGATGAGCT TCGCAAAAAT GTCACAGGGA TCCTGTGGAA CCTTTCATCC AGCGACCACC 1380 

TGAAGGACCG CCTGGCCAGA GACACGCTGG AGCAGCTCAC GGACCTGGTG TTGAGCCCCC 1440 

TGTCGGGGGC TGGGGGTCCC CCCCTCATCC AGCAGAACGC CTCGGAGGCG GAGATCTTCT 1500 

ACAACGCCAC CGGCTTCCTC AGGAACCTCA GCTCAGCCTC TCAGGCCACT CGCCAGAAGA 1560 

TG06GGAGTG CCACGGGCXG GTGGACGCCC TGGTCACCTC TATCAACCAC GCCCTGGACG 1620 

CGGGCAAATG 06AGGACAAG AGCGTGGAGA ACGCGGTGTG OGTCCTGOGG AACCTGTCCT 1680

ACCGCCTCTA CGAOGAGATG CCGCCX5TCCG CGCTGCAGCG GCTGGAGGGT CGCGGCCGCA 1740 

GGGACCTGGC GGGGGCGCCG CCGGGAGAGG TCGTGGGCTG CTTCACGCOG CAGAGCOGGC 1800 

GGCTGCGCGA GCTGCCCCTC GCCGCCGATG CGCTCACCTT CGCGGAGGTG TCCAAGGACC 1860 

CCAAGGGCCT OSAGTGGCTG TGGAGCCCCC AGATOGTGGG GCTGTACAAC CGGCTGCTGC 1920 

AGCGCTGCGA GCTCAACCGG CACACGACGG AGGGGGCOGC GGGGGOGCTG CAGAACATCA 1980 

CGGCAG60GA 0G6CAGGT6G GCGGGOGTGC TGAGCOSCCT GGCCCTGGAG CAOGAOOGTA 2040 

TTCTGAACCC CCTGCTAGAC CGTGTCAGGA CCGCC3GACCA CCACCAGCTG CGCTCACTGA 2100 

CTGGCCTCAT CCGAAACCTG TCTCGGAACG CTAGGAACAA GGAOGAGATG TCCACGAAGG 2160 

TGGTGAGCCA CCTGATCGAG AAGCTGCCAG GCAGCGTGGG TGAGAAGTOG CXXXXZAGCCXS 2220 

AGGTGCTGGT CAACATCATA GCTGTGCTCA ACAACCTGGT GGTGGCCAGC CCCATCGCTG 2280 

CCCGAGACCT GCTGTATTTT GACGGACTCC GAAAGCTCAT CTTCATCAAG AAGAAGCGGG 2340 

ACAGCXX:CGA CAGTGAGAAG TCCTCCCGGG CAGCATCCAG CCTCCTGGCC AACCTGTGGC 2400 

AGTACAACAA GCTCCACOGT GACTTTCXSGG CGAAGGGCTA TCGGAAGGAG GACTTCCTGG 2460 

GCCGATAGGT GAAGCCTTCT GGA6GAGAA6 GTGAO6T0GC CXZAGGQTCCA AGGGACAGAC 2520 

TCAGCTCCAG GCTGCTTGGC AGCCCAGCCT GGAGGAGAAG GCTAATGACG GAGGGGCCOC 2580 

TCGCTGGGGC CCXZTGTGTGC ATCTTTGAGG GTCXTCGGCC ACCAGGAGGG GCAGGGTCTT 2640 

ATAGCTGGGG ACTTGGCTTC CGCAGC3GCAG GGGGTGGGGC AGGGCTCAAG GCTGCTCTGG 2700 

TGTATGGGGT GGTGACCCAG TCACATTGGC AGAGGTGGGG GTTGGCTGTG GCCTGGCAGT 2760 

ATCTTGGGAT AGCCAGCACT GGGAATAAAG ATGGCCATGA ACAGTCACAA AAAAAAAAAA 2820 
AAAAGGAATT C 

Seq ID NO: 351 Protein sequence
Protein Accession #: NP_009114.1

1 11 21 31 41 51

| | | | | |

MCKXSTFLLSA LQPBAGVCSl. ALPSOLQU)S RGABGPEAER LRAARVQEQV RARLLQU3QQ 60
PBHKGAAEPB PEA5TARGTS RGQYHTX^QAG FS5RSQGLSG DKTSGFRPIA KFAYSPASWS 120
SRSAVDLSCS SRLSSAHNGG SAFGAAGYGG AQPTPPKPTS            GSSADYDTLS 180
LRSLRLG9GG LDDSYSLVSS QLEPAATSTY RAFAYERQAS S5SSRAGGLD KPBATEV5PS 240
RTIRAPAVRT LQRFQSSHRS RGVGGAVPGA VLEFVASAPS VRSLSLSLAD            300
KSYGSHRTLQ RLSSGFDDID LPSAVKYLf<lA SDPNLQVLGA AYIOHKCYSD AAAKKQARSL 360
OAVPRLVKLF NHANQEVQRH ATGAMRNLIY DIIADNKLALV EENGIFELLR TLREQDDBLR 420
BKVTGZLHKL SSSDHLKDRL ARDTLEQXiTD LVLSPLSGAG GPPLIQQNAS EAEIFYNATG 480
FLRKLSSASQ ATRQKMRECH GLVDALVTSI NHALDAGKCE OKSVBiAVCV LRIILSYRLYD 540
EMPPSALQRL EGRGRaOLAG APPGEWGCP TFQSRRLREti PLAADALTFA EVSKDPKSLB 600
HLHSPQIVGL YNRLLQRCEL NRHTTEAAAG ALQNITAGDR SKAGVLSRLA VBQESlVSBh 660
LDRVRTADHH QLRSLTGLIR NIiSRKASNXD Q4STKWSHL lEKLFGSVGB KSPPAEVLVM 720
IIAVliNNLW ASPIAARDLL YPDGLRKLIP ZKKKRDSFDS            LANLWQYtnOj 780
HROFRAKGYR KEOPLGP



Seq ID NO: 352 DNA sequence
Nucleic Acid Accession #: M31469
Coding sequence: 1-651

1 11 21 31 41 51

| | | | | |

ATGGCTGOGC AQGGAOAGCC GCAGGTCCA6 TTCAAACTTG TATTGGTTGQ TGATGGTGGT 60
ACTGGAAAAA CGACCTTCGT GAAACGTCAT TTGACTGGTG AATTTGAGAA GAAGTATGTA 120
GCCACCTTGG GTGTTGAGGT TCATOCCCTA GTGTTCCACA CCAACAGAGG ACCTATTAAG 180
TTCAATGTAT GGGACACAGC OGGCCAGGAG AAATTCGGTG GACTGAGAGA TCXXrrATTAT 240
ATCCAAGCCC AGTGTGCCAT CATAATGTTT GATGTAACAT CGAGAGTTAC TTACAAGAAT 300
GTGCCTAACT GGCATAGAGA TCTGGTACGA GIGTGT6AAA ACATCOCCAT TGTGTTGTGT 360
GGCAACAAAG TGGATATTAA GGACAGGAAA GTGAAGGOGA AATCCATTGT CTTCCAC06A 420
AAGAAGAATC TTCAGTACTA OGACATTTCT GCCAAAAGTA ACTACAACTT TGAAAAGCOC 480
TTCCTCTGGC TTGCTAGGAA GCTCATTGGA GACCCTAACT TGGAATTTGT TGCCATGCCT 540
GCTCTCGCCC Cy^CCAGAAGT TGTCATGGAC CCAGCTTTGG CAGCACAGTA TGAGCAOGAC 600
TTAGAGGTTG CTCAGACAAC TGCTCTCOCG GATGAGGATG ATGAOCTGTG A



Seq ID NO: 353 Protein sequence
Protein Accession #: AAA36546

1 11 21 31 41 51

| | | | | |

MAAQGEPQVQ FKLVLVGDGG TGKTTFVKRH LTGBFEKKYV ATLGVEVHPL VFHTMRGPIK 60
FMVNDTAGQB KFGGLRDGYY IQAQCAZIMF OVTSRVTYXN VPNHKRDLVR VCEKIPIVZX: 120
GNKVDZKDRK VKAKSIVFHR KKNLQYYDIS AKSNYNFEKP FLHLARKLIG DPNLEFVAMP 180
ALAPPEWMD PALAAQYEHD LEVAQTTALF DEDDDL



Seq ID NO: 354 DNA sequence
Nucleic Acid Accession #: NM_002820
Coding sequence: 304-831

1 11 21 31 41 51

| | | | | |

COGGTTCGCA AAGAAGCTGA CTTCAGAGGG GGAAACTTTC TTCTTTTAGG AGGCGGTTAG 60
CCCTGTTCCA C6AACCCAGG AGAACTGCTG GCCAGATTAA TTAGACATTG CTATGGGAGA 120
OGTGTAAACA O^TACTTAT CATT6ATGCA TATATAAAAC CATTTTATTT TOGCTATTAT 180
TTCAGAGGAA GCGCCTCT6A TOViTA ' CrA ' TTTTCCCTTT TTGCTcrrrc TGGCTGTGTG 240
GTTTGGAGAA AGCACAGTTG GAGTAGCCGG TTGCTAAATA AGTCCCGAGC GCGAGOGGAG 300
ACGATGCAGC GGAGACTGGT TCAGCAGTGG AGCGTCGCGG TGTTCCTGCT GAGCTACGCX5 360
GTGCCCTCCT GCGGGCGCTC GGTGGAGGGT CTCAGOCGCC GCCTCAAAAG AGCTGTGTCT 420
GAACATCAGC TCCTCCATGA CAAGGGGAAG TCCATCCAAG ATTTAOGGCG AOGATTCTTC 480
CTTCACCATC TGATOGCAGA AATCCACACA GCTGAAATCA GAGCTACCTC GGAGGTGTCC 540
CCTAACTCCA AQCCCTCTCC CAACACAAAG AAGCACGCCG TCCGATTTGG GTCTGATGAT 600
GAGGGCAGAT ACCTAACTCA GGAAACTAAC AAGGTG6AGA OGTACAAAGA GCAGCOGCTC 660
AAGACACCTG 6GAAGAAAAA GAAA6GCAAG CCOGOGAAAC GCAA6GAGCA GGAAAAGAAA 720
AAACGGCGAA CTCGCTCTGC CTGGTTAGAC TCTGGAGTGA CTGGGAGTGG GCTAGAAGGG 780
GACCACCTGT CTGACACCTC CACAACGTOG CTGGAGCTCG ATTCACGGTA ACAGGCTTCT 840
CTGGCCCGTA GCCTCAGCGG GGTGCTCTCA GCTGGGTTTT GGAGCCTCCC TTCTGCCTTG 900
GCrrGGACAA AOCTAGAATT TTCTOOCTTT ATGTATCTCT ATGGATTGTG TAGCAATTGA 960
CAGAGAATAA CTCAGAATAT TGTCTGCCTT AAAGCA6TAC CCCOCTACCA CACACACCCC 1020
TGTCCTCCAG CACCATAGAG AGGCGCTAGA GCCCATTCCT CTTTCTCCAC CGTCACCCAA 1080
CATCAATCCr TTACCACTCT ACCAAATAAT TTCATATTCA AGCTTCAGAA GCTAGTGACC 1140
ATCTTCATAA TTTGCTGGAG AAGTGTATTT CTTCCXrCTTA CTCTCACACC TGGGCAAACT 1200
TTCTTCAGTG TTTTTCATTT CTTACGTTCT TTCACTTCAA GGGAGAATAT AGAAGCATTT 1260
GATATTATCr ACAAACACTG CAGAACAGCA TCATGTCATA AAOGATTCTG AGCCATTCAC 1320
ACTTTTTATT TAATTAAATG TATTTAATTA AATCTCAAAT TTATTTTAAT GTAAAGAACT 1380
TAAATTATGT TTTAAACACA TGGCTTAAAT TTGTTTAATT AAATTTAACT CTG6TTTCTA 1440
CCAGCTCATA CAAAATAAAT GGTTTCTGAA AATGTTTAAG TATTAACTTA GAAG6ATATA 1500
GGTTTTTCTC ATGTATCTTT TTGTTCATTO GCAAGATGAA ATAATTTTTC TAG6GTAAT6 1560
COSTAGGAAA AATAAAACTT CACATTTAAA AAAAA



Seq ID NO: 355 Protein sequence
Protein Accession #: NM_002820

1 11 21 31 41 51

| | | | | |

MQRRLVQQHS VAVFLLSYAV PS06RSVEGL SRRLKRAVSE HQLLHDKGKS IQDLRSRFFL 60
HHUAEIRTA EIRATSEVSP NSKPSPNTKK HPVRFGSDDE GRYIiTQETKK VETYKEQPIiK 120
TFGKKXKGSF GXSSEQ&KKR RRTSSAHLDS GVTGSGLBCS) HLSOTSTTSL BU)SR



Seq ID NO: 356 DNA sequence
Nucleic Acid Accession #: NM_017522
Coding sequence: 1-2100

1 11 21 31 41 51

| | | | | |

ATGGGGCTCC C0GAGCX3GGG COCTCTCCXjG CTTCTGGCGC TGCTGCTGCT GCTGCTGCTG 60
CTGCTGCTGC TGCGGCTCCA GCATCTTGOG GCGGCAGCGG CTGATC06CT GCTCGGOGGC 120
CAAGGGCGGG cx:aaggagtg OGAAAAGGAC CAATTCCAGT GCCGGAOGA GCGCTGCATC 180
CCCTCTGTGT ggagatgcga CGAGGAOGAT GACTGCTTAG AOCACAGGGA CGAGGACGAC 240
TGCCCCAAGA agacctgtgc AGACAGTGAC TTCa^CCTGTG ACAAOGGCCA CTGCATCCAC 300
GAAOGGTGGA agtgtgacgg OGAGGAGGAG TGTCCTGATG GCTCCGATGA GTCOGAGGCC 360
ACTTGCACCA a6caggtgtg TCCTGCAGAG AAGCTGAGCT GTGGAC^AC CAGCCACAAG 420
TGTGTACCT6 CCTOGTGGCG CTGCGAOGGG GAGAAGGACT GGQAGGGTGG AGGGQATGAG 480
GCOGGCTGTG CTACCTCACT GGGCACCTGC 0GV6GGGACG AGTTOCAGTG T6GGGATGGG 540
ACATGTGTCC TTGCAATCAA GCACTGCAAC CAGGAGCAGG ACTGTCCAGA TGGGAGTGAT 600
GAAGCTGGCT GCCTACAGGG GCTGAACGAG TGTCTGCACA ACAATGGOOG CTGCTCACAC 660
ATCTGCACTG ACCTCAAGAT TGGCTTTGAA TGCACGTGCC CAGCAGGCTT CCAGCTCCTG 720
GMXAGAAGA CTTGTGGCGA CATTGATGAG TGCAAGGACC CAGATGCCTG CAGCCAGATC 780
TGTGTCAATT ACAAGGGCTA TTTTAAGTGT GAGT6CTAOC CTGGCTGOGA GATGGA<XTA 840
CTGACCAAGA ACTGCAAGGC TGCTGCTGGC AAGAGCCCAT CCCTAATCTT CACCAACOGC 900
AOGAGTGCGG AGGATOGACC TGTGAAGCGG AACTATTCAC GCCICATCCC CATGCTCAAG 960
AATGTCGTGG CACTAGATGT GGAAGTTGCC ACCAA70GCA TCTACTGGTG TGACCTCTCC 1020
TACCGTAAGA TCTATAGC3GC CTACATGGAC AAGGCCAGTG ACCCGAAAGA GOGGGAGGTC 1080
CTCATTGACG AGCAGTTGCA CTCTCCAGAG GGCCTGGCAG TGGACTGGGT CX3VCAAGCAC 1140
ATCTACTGGA CTGACTCGGG CAATAAGACC ATCTCA6TGG CCACAGTTGA TGGTGGCCGC 1200
C6A0GCACTC TCTTCAGCOG TAAGCTCAGT GAACXXXX30G CCATGGCTGT TGACCCCCTG 1260
OGAGGGTTCA TGTATTGGTC TGACTGGGGG GACCAQGCCA AGATTGAGAA ATCTGGGCTC 1320
AACGGTGTGG ACOGGCAAAC ACTGGTGTCA GACAATATTG AATGGCCCAA CGGAATCACC 1380
CTGGATCTGC TGAGCCAGCG CTTGTACTGG GTAGACTCCA AGCTACACCA ACTGTCCAGC 1440
ATTGACTTCA GTGGAGGCAA CAGAAAGACG CTGATCTCCT CXa^CTGACTT CCTGAGCCAC 1500
CCTTTTGGGA TAGCTGTGTT TGAGGACMG GTGTTCTGGA CaGACCTGGA GAACGAGGCC 1560
ATTTTCAGTG CAAATCGGCT CAATGGCCTG GAAATCTCCA TCCTGGCTGA GAACCTCAAC 1620
AACCCACATG ACATTGTCAT CTTCCATGAG CTGAAGCAGC CAAGAGCTCC AGATGCCTGT 1680
GAGCTGAGTG TCCAGCCTAA TGGAGGCTGT GAATACCTGT GOCTTCCTGC TCCTCAGATC 1740
TCCAGCCACT CTCCCAAGTA CACATGTGCC TGTCCTGACA CAATGTGGCT GGGTCCAGAC 1800
ATGAAGAGGT GCTACXrGAGA TGCAAATGAA GACAGTAAGA TGGGCTCAAC AGTCACTGCC 1860
GCTGTTATCG GGATCATCGT GCCCATAGTG GTGATAGCCC TCCTGTGCAT GAGTGGATAC 1920
CTGATCTGGA GAAACTGGAA GCGGAAGAAC ACCAAAAGCA TGAATTTTGA CAACCCAGTC 1980
TACAGGAAAA CAACAGAAGA AGAAGATGAA GATGAGCTCC ATATAGGGAG AACTGCTCAG 2040
ATTGGCCATG TCTATCCTGC ACGAGTGGCA TTAAGOCTTG AAGAtGATGG ACTACCCTGA 2100
GGATGGGATC ACCCCCTTCG TGCCTCATGG AATTCAGTCC CATGCACTAC ACTCCGGATG 2160
GTGTATGACT GGATGAATGG GTTTCTATAT ATGGGTCTGT GTX5AGTGTAT GTGTGTGTGT 2220
GATTTTTTTT TTTAAATTTA TGTTGCGGAA AGGTAACCAC AAAGTTATGA TGAACTGCAA 2280
ACATCCAAAG GATGTGAGAG TTTTTCTATG TATAATGTTT TATACACTTT TTAACTGGTT 2340
GCACTACCCA TGAGGAATTC GTGGAATGGC TACrCCTGAC TAACATGATG CACATAACCA 2400
AATGGGGGCC AATGGCACAG TACXTTTACTC ATCATTTAAA AACTATATTT ACAGAAGATG 2460
TTTGGTTGCT GGGGGGCTTT TTTAGGTTTT GGCCATTTGT TTTTTGTAAA TAAGATGATT 2520
ATGCTTTGTG GCTATCCATC AACATAAGT



Seq ID NO: 357 Protein sequence
Protein Accession #: NP_059992

1 11 21 31 41 51

| | | | | |

KGLPBPGPZ/R LLAT.TiT.LLLL LLLLRLQHLA AAAAOPLLGG OGPAKECEKD QFOCRNBRCI 60
PSVWRCOEDD DCLDHSDEDD CPKKTCAOSD PTCDNGHCIH SRWKCDGEEE CPDGSDESEA 120
TCTKQVCPAB KLSCGPTSHK CVPASWRCDG EKDCEGGADE AGCATSLGTC RGDEFQCGDG 180
TCVLAIKKCEf QEQDCPOGSO EAGCLQGLNE CLHNNGGCSH ICTDLKIGFE CTCPAGFQLL 240
DQKTCGDIDE CXDPDACSQI CVNYKGYPKC ECYPGCEMDL LTKNCKAAAG KSPSLIPTNR 300
TSAEDRPVKR NYSRIilPMLK NWALDVEVA TNRIYWCDLS YRKZYSAYMD KASDPKEREV 360
LIDEQLHSPB GLAVDWVHKH lYWTDSGPdCT ISVATVDGGR RRTLFSRNLS EPRAIAVDPLtt 420
RGPMYWSDWG DQAKIEKSGL NGVDRQTLVS DNIEWPNGIT LDLLSQRLYW VDSKLHQLSS 480
IDPSGGNRKT LISSTDFLSH PFGIAVFEDK VFMTDLENEA IPSANRLNGL EISILAEtTLM 540
NPHOIVIFHE LKQPRAPDAC Bi*SVQPKGGC EYLCLPAPQI SSHSPKYTCA CPDTMWLGPD 600
MKRCYRDANE DSKMGSTVTA AVIGIIVPIV VIALLCM5GY LIWRNHKRKN TK5HNFDNPV 660
YRKTTEEEDE DBLHIGRTAQ IGHVYPARVA LSLEDDGLP



Seq ID NO: 358 DNA sequence
Nucleic Acid Accession #: M27826
Coding sequence: <1-503

1 11 21 31 41 51

| | | | | |

AGCCCAAGAA ACATCTCACC AATTTCAAAT CTGATCTATT CGGCTTAGCG ACTGAAGATT 60
GACGCTGCCC GATOGCCTCG GAAGTCCCCT GGACCATCAC AGAAGCCGAG CTTCGGGTAA 120
CTCTCACAGT GGAGGGTAAG TCCATCCCCT GTTTAATCGA TACGGGGGCT ACCCACTCCA 180
CGTTGCCTTC TTTTCAAGGG CCTGTTTCCC TTGCCCCCAT AACTGTTGTG GGTATTGACG 240
GGCAAGCTTC AAAACCCCTG AAAACTGCCX: CACTCTGGTG CCMCTTGGA CAACACTCTT 300
TTATGCACTC TTTTTTAGTT ATCCCCACCT GCCCACTTCC CTTATTAGGC OGAAATATTT 360
TAACCAAATT ATCTGCTTCC CTGACTATTC CTGGAGTACA GCTACATCTC ATTGCTGCOC 420
TTCTTCCCAA TCCAAAGCCT CCTTTGTCTC CTCTAACATC CCCACAATAT CAGCCCTTAC 480
CACAAGACCT CCCTTCAGCT TAATCTCTCC CACTCTAGGT TCCCACGC06 CCCCTAATCC 540
CACTTGAAGC AGCCCTGAGA AACATOGGCC ATTCTCTCTC CATACCACCC CrCAAAAATT 600
TTOGCOGCTC CAACACTTCA ACACTATTTT GTTTTATITG TCTTATTAAT ATCAGAAGGC 660

ACGAATGTCA GGCCTCTGAG CCCAGGCCaC GCCATOGCAT CCCCTGTGAC TTGCAOSTAT 720

ACATOAGAT GGCCTGAAGT AACTGAAGAT CCACAAAABA AGTAAAAACA GCC TTAACTO 780 

ATGACATTCC ACCATTGT6A • mXirnXHM CCCCRCCCIA ACTGKrCftAT GTACTTTOTA 840 

ATCrOOCCCA CCCTTAAGAA GGTTCTTTGT AATTCTCCCC ACCCTT GftGA ATUTA Cll IG 900 

TGAGATCCAC CCCTGCCCAC CAGAGAACAA CCOCCTTTGA TTGTAATTTT TTATTAOCTT 960 

CCCAAATCCT ATAAAACAGC CCCACCCCTA TCTTCCTTCA CTGACTCTCT TTTCGGACTC 1020 
AGGCAOQGGC AOCCAGGTGA AATAAACAGC TTIATTSCTC AC 

Seq ID NO: 359 Protein sequence
Protein Accession #: AAA65999

1 11 21 31 41 51

| | | | | |

PKXKLTMFKS DLPGLATEDW RCPIASEVPW TITEAEliRVT LTVEGKSIPC LIDTGATEST 60 

LPSFQGPVSL APITWGIDG QASKPUCTPP LKOQLGQHSF MHSFLVIPTC PLPUiGSHIL 120 
TKLSASLTIP GVOUILIAAI. I«PHPKPPLCP LTSPQYQPIiP QDLPSA 

Seq ID NO: 360 DNA sequence
Nucleic Acid Accession #: NM_001854
Coding sequence: 162-5582

1 11 21 31 41 51

| | | | | |

AACCATCAAA TTTAGAAGAA AAAGCCCTTT GACTTTTTCC CCCTCTCCCT CCCCAATGGC 60 

TGTGTAGCAA ACATCCCTGG CGATACCTTG GAAAGGACGA AGTTGGTCTG CAGTCGCAAT 120

TTGGTGGCTT GAGTTCACAG TTGTGAGTGC GGGGCTOGGA GATGGAGCOG TGGTCCTCTA 180 

6GTGGAAAAC GAAACGGTGG CTCTGGGATT TCACOGTAAC AACCCTCGCA TTGACCTTCC 240 

TCTTCCAAGC TAGAGAGGTC AGAGGAGCTG CTCCACrTGA TGTACTAAAA GCACTAGATT 300 

TTCACAATTC TCCAGAGGGA ATATCftAAAA CAAOGGGATT TTGCACftAAC AGAAAGAATT 360 

CTAAAGGCrC AGATACTGCT TACAGAGTTT CAAAGCAAGC ACAACTCA6T GCCCCAACAA 420 

AACAGTTATT TCCAGGTGGA ACTTTCCCAG AAGACTTTTC AATACTATTT ACAGTAAAAC 480 

CAAAAAAAGG AATTCAGTCT TTCCTTTTAT CTATATATAA TGAGCATGGT ATTCAGCAAA 540 

TTGGTGTTGA GGTTGGGAGA TCACCTGTTT TTCTGTTTGA AGACCACACT GGAAAACCTG 600 

CCCCAGAAGA CTATCCCCTC TTCAGAACTG TTAACATOGC TGAOGGGAAG TGGCATGGG6 660 

TAGCAATCAG CGTGGAGAAG AAAACTGTGA CAAT6ATTGT TGATTGTAAG AAGAAA ACCA 720 

CGAAACCACT TGATAGAAGT GAGAGAGCAA TTGTTGATAC CAATGGAATC ACGGTTTTTG 780 

GAACAAGGAT TTTGGATGAA GAAGTTTTTG AGGGGGACAT TCAGCAGTTT TTGATCACAG 840 

GTGATCCCAA GGCAGCATAT GACTACTGTG AGCATTATAG TCCAGACTGT GACTCTTCAG 900 

CACCCAAGGC TGCTCAAGCT CACGAACCTC AGATAGATGA GTATCCACCA GAGGATATAA 960 

TG6AATATGA CTATGAGTAT GGGGAAGCAG AGTATAAAGA GGCTGAAAGT GIAA CAGAG G 1020 

GACCCACTGT AACTGAGGA6 ACAATAGCAC AGAGGGAGGC AAACATCGTT GAT GATTT TC 1080 

AAGAATACAA CTATGGAACA ATGGAAAGTT AOCAGACI^ AGCTOCTAGG CATGTTTCTG 1140 

GGACAAATGA GCCAAATCCA GTTGAA6AAA TATTTACTGA AGAATATCTA AGGGGAGAGG 1200 

ATTATGATTC CCAGAGGAAA AATTCTGAGG ATACACTATA TGAAAACAAA GAAATAGACG 1260

GCAGGGATTC TGATCTTCTG GTAGATGGAG ATTTAGGCGA ATATGATTTT TATGAATATA 1320 

AAGAATATGA AGATAAACCA ACAAGCCCCC CTAATGAAGA ATTTGGTCCA GGTGTACCAG 1380 

CAGAAACTGA TATTACAGAA ACAAGCATAA ATGGOCATGG TGCATATGGA GA6AAAGQAC 1440 

AGAAAGGAGA ACCAGCAGTG GTTGAGCCTG GTATGCTTGT CGAAG6ACCA CCAGGACCAG 1500 

CAGGACCTGC AGGTATTATG GGTCCTCCAG GTCTACAAGG CCCCACTGGA CCCCCTGGTG 1560 

ACCCTGGCGA TAGGGGCCCC CCAGGACGTC CTGGCTTACC AGGGGCTGAT GGTCTACCTG 1620 

GTCCTCCTGG TACTATGTTG ATGTTACCGT TCCGTTATGG TGGTGATGGT TCCAAAGGAC 1680 

CAACCATCTC TGCTCAGGAA GCTCAGGCTC AAGCTATTCT TCAGCAGGCT CGGATTGCTC 1740 

TGAGAGGCCC ACCTGGCCCA ATGGGTCTAA CTGGAAGACC AGGTCCTGTG GGGGGGCCTG 1800 

GTTCATCTGG GGCCAAAGGT GAGA6TGGTG ATCCAGGTCC TCAGGGCCCT OGAGGOGTCC 1860 

AGGGTCCCCC TGGTCCAAC6 GGAAAACCTG GAAAAAGGGG TCG TCCA GOT GCAGATGGAG 1920 

GAAGAGGAAT GCCAGGAGAA CCTGGGGCAA A6GGAGAT08 AGGGTTTGAT GGACTTCCGG 1980 

GTCTGCCAGG TGACAAAGGT CACAGGGGTG AACGAGGTCC TCAAGGTCCT CCAGGTCCTC 2040 

CTGGTGATGA TGGAATGAGG GGAGAAGATG GAGAAATTGG ACCAAGAGGT CTTCCAGGTG 2100 

AAGCTGGCCC ACGAGGTTTG CTGGGTCCAA GGGGAACTCC AGGAGCTCCA GGGCAGCCTG 2160 

GTATGGCAGG TGTAGATGGC CCCCCAG6AC CAAAAGG6AA CATGGGTCCC CAAGGGGAGC 2220 

CTGGGCCTCC AOGTCAACAA GGQAATCCAG GACCTCAGGG TCTTCCTGGT CCACAA6GTC 2280 

CAATTGGTCC TCCTGGTGAA AAAGGACCAC AAGGAAAAOC AGGACTTGCT GGACTTCCTG 2340 

GTGCTGATGG GCCTCCTCGT CATCCTGGGA AAGAAGGCCA GTCTGGAGAA AACGGGGCTC 2400 

TGGGTCCCCC TGGTCCACAA GGTCCTATTG GATNNCCGGG CCCCOGGGGA GTAAAGGGAG 2460 

CAGATGGZGT CAGAGGTCTC AAGGGATCTA AAGGTGAAAA GGGTGAAGAT GGTTTTCCAG 2520 

GATTCAAAGG TGACATGGGT CTAAAAGGTG ACAGAGGAGA AGTTGGTCAA ATTGGCCCAA 2580 

GAGGGNAAGA TGGCCCTGAA GGACCCAAAG GTCGAGCAGG CCCAACTGGA GACCCAGGTC 2640 

CTTCAGGTCA AGCAGGAGAA AAGGGAAAAC TTGGAGTTOC AOGATTACCA GGATATCCAG 2700 

GAAGACAAGG TCCAAAGGGT TCCACTGGAT TCCCT6GGTT TCCAGGTGCC AATGGAGAGA 2760 

AAGGTGCACG GGGAGTAGCT GGCAAACCAG GCCCTCGGGG TCAGOGTGGT CCAACGGGTC 2820 

CTCGAGGTTC AAGAGGTGCA AGAGGTCCCA CTGGGAAACC TGGGCCAAAG GGCACTTCAG 2880 

GTGGCGATGG CCCTCCTGGC CCTCCAGGTG AAAGAGGTCC TCAAGGACCT CAGGGTCCAG 2940 

rrGGATTCCC T6GACCAAAA GGCCCTCCT6 GACCACCAGG AAGGATGGGC TGGCCAGGAC 3000 

ACCCTGGGCA A06TGGGGAG ACTGGATTTC AAGGCAA6AC OGGCCCTCCT GGGCCAGGG6 3060 

GAGTGGTTGG ACCACAGGGA CCAACCGGTG AGACTGGTCC AATAGGGGAA OGTGGGTATC 3120 

CTGGTCCTCC TGGCCCTCCT GGTGAGCAAG GTCTTCCTGG TGCTGCAGGA AAAGAAGGTG 3180 

CAAAGGGTGA TCCAGGTCCT CAAGGTATCT CAGGGAAAGA TGGACCAGCA GGATTACGTG 3240 

GTTTCCCAGG GGAAAGAGGT CTTCCTGGAG CTCAGGGTGC ACCTGGACTG AAAGGAGGGG 3300 

AAGGTCCCCA GGGCCCACCA GGTCCAGTTG GCTCACCAGG AGAAOGTGGG TCAGCAGGTA 3360 

CAGCTGGCCC AATTGGTTTA CGAGGGCGCC CGGGACCTCA GGGTCCTCCT GGTCCAGCTG 3420 

GAGAGAAAGG TGCTCCTGGA GAAAAAOGTC CCCAAGGGCC TGCAGGGAGA GATGGAGTTC 3480 

AAGGTCCTGT TGGTCTCCCA GGGCCAGCTG GTCCTGCCGG CTCCOCTGGG GAAGAOGGAG 3540 

ACAAGGGTGA AATTGGTGAG COGGGACAAA AAGGCAGCAA GGGTGGCAAG GGAGAAAATG 3600 

GCCCTCCCGG TCCCCCAGGT CTTCAAGGAC CACnGGTGC CCCTGGAATT GCTGGAGGTG 3660 

ATGGTGAACC AGGTCCTAGA GGACAGCAGG GGAT6TTTGG GCAAAAAGGT GATGAGGGTG 3720 

CCAGAGGCTT COCTGGAOCT CCTGGTCCAA TAGGTCTTCA GGGTCTGCCA GGCCCACCTG 3780 

GTGAAAAAGG TGAAAATOGG GATGTTGGTC CATGGGGGCC ACC7GGTCCT CCAGGCCCAA 3840 
GAGGCCCTCA AGGTOCCAAT GGAGCTGATG GACX3iCAM5G PJCCCCCMXTI TCIVITGGTT 3900

CAGTTCGTGG TGTTG6AGAA AAGGGTGAAC CTQGSCAAGC AGGAAACCCA GGGCCTCCTG 3960 

GGGAAGCAGG TGTAGGOGGT CCCAAAGOAG AAAGAOGAGA GAAAGGGGAA GCTG GTCCA C 4020 

CTGGAGCTGC TGGACCTCCA GGTGCCAAQG GGOOGCCAGG TGATGATGGC CCTAAGGGTA 4080 

ACCOGGGTCC TCTTGGTTTT CCTGGAGATC CTGGTCCTOC TGGGGAACTT GGCCCTGC3«5 4140 

GTCAAGATCG TGTTGGTGGT GACAAGGGTG AAGATGGAGA TCCTGGTCAA CCGGGTCCTC 4200 

CTGGCCCATC TGGTGAGGCT GGCCCACCAG GTCCTCCTGG AAAACGAGGT CCTCCTGGAG 4260 

CTGCAGGTGC AGRfiGGRAGA CAAGGTGAAA AAGGTGCTAA GGGGGAAGCA GGTGCAGAAG 4320 

GTCCTCCTOG AAAAACCGGC CCAGTOGGTC CTCAGGGACC TGCAGGAAAG CCTGGTCCAG 4380 

AAGGTCTTCG GGGCATCCCT GGTCCTGTGG GAGAACAAGG TCTCCCTGGA GCTGCAGGCC 4440 

AAGATGGACC ACCTGGTCCT ATGGGACCTC CTGGCITACC TGGTCTCAAA GGTGAOCCTG 4500 

GCTCCAAGGG TGAAAAGGGA CATCCTGGTT TAATTGGCCT GATTGGTCCT CCAGGAGAAC 4560 

AACGGGAAAA AGGTCACCGA GGGCTCCCTG GAACTCAAGG ATCTCCAGGA GCAAAAGGGG 4620 

A-EGGGGGAAT TCCTCGTCCT GCTGGTCCCT TAGGTCCACC TGGTCCTCCA GGCTTACCAG 4680 

GTCCTCAAGG CCCAAAGGGT AACAAAGGCT CTACTGGACC CGCTTOOCAG AAAGGTGACA 4740

GTCGTCTTCC AGGGCCTCCT GGGCCTOCAG GTOCACCTGG TGAAGICATT CAGCCTTTAC 4800 

CAATCTTGTC CTCCAAAAAA A06ACAA6AC ATACTGAAGG CA3GCAAGCA GATGCAQATG 4860 

ATAATATTCT TGATTACTCG GATGGAATGG AAGAAATATT TGCrTCCCTC AATTCCCTGA 4920 

AACAAGACAT OGAGCATATG AAATTTCCAA TGGGTACTCA GACCAATCCA GCCCGAACTT 4980 

GTAAAGACCT GCAACTCAGC CATCCTGACT TCCCAGATGG TGAATATTGG ATTGATCCTA 5040 

ACCAAGGTTC CTCAGGAGAT TCCTTCAAAG TTTACTCTAA TTTCACATCT GGTGGTGAGA 5100 

CrrcCATTTA TCCAGACAAA AAATCTGAGG GAGTAAGAAT TTCATCATGG CCAAAGGAGA 5160 

AACCAGGAAG TTGGTTTAGT GAATTTAAGA GGGGAAAACT GCTTTCATAC TTAGATGTTG 5220 

AAGGAAATTC CATCAATATG GTGCAAATGA CATTCCTGAA ACTTCTGACT GCCTCTGCTC 5280 

GGCAAAATTT CACCTACCAC TGTCATCAGT CAGCAGCCTG GTATGATGTG TCATCAGGAA 5340 

GTTATGACAA AGCACTTCGC TTCCTGGGAT CAAATGATGA GGAGATGTCC TATGACAATA 5400 

ATCCTTTTAT CAAAACACTG TATGATGGTT GTACGTCCAG AAAAGGCTAT GAAAAAACTG 5460 

TCATTGAAAT CAATACACCA AAAATTGAtC AAGTACCTAT TGTTGATGTC ATGAYCAGTG 5520

ACTTTGGTGA TCAGAATCA6 AAGTTCGGAT TTGAAGTTGG TOCTGTTTGT TTTCTTGGCT 5580 

AAGATTAAGA CAAAGAACAT ATCAAATCAA CAGAAAATGT ACCTTGGT6C CACCAACCCA 5640 

TTTTGTGCCA CATGCAAGTT TTGAATAAGG ATGTATGGAA AACAACGCTG CATATACAGG 5700 

TACCATTTAG GAAATACCGA TGCCTTTGTG GGGGCAGAAT CACAGACAAA AGCTTTGAAA 5760 

ATCATAAAGA TATAAGTTGG TGTGGCTAAG ATGGAAACAG GGCTGATTCT TGATTOXAA 5820 

TTCTCAACTC TCCTTTTCCT ATTTGAATTT CTTTQGTGCT GTA(3AAAACA AAAAAAGAAA 5880 

AATATATATT CATAAAAAAT ATGGTGCTCA TTCTCATCCA TCCAGGATGT ACTAAAACAG 5940 

TGTCTTTAAT AAATTGTAAT TATTTTGTGT ACAGTTCTAT ACTGTTATCT GTGTCCATTT 6000 

CCAAAACTTG CACGTGTCCX: TGAATTCCGC TGACTCTAAT TTATGAGGAT GCOGAACTCT 6060 

GATGGCAATA ATATATGTAT TATGAAAATG AAG TTATG AT TTCCGATGAC CCTAAGTCXX: 6120 
TTTCrTTGGT TAATGATGAA ATTCCTTTGT GTOTGTTT 

Seq ID NO: 361 Protein sequence
Protein Accession #: NP_001845

1 11 21 31 41 51

MEPWSSRWKT ICRWLWDPTVT TIALTFLFQA REVRGAAPVD VLKALDFHNS PEGISKTTGP 60 

CTNRKNSKGS DTAYRVSKQA QIiSAPTKQLP PGGTFPEDPS ILFTVKPKKG IQSFI.I1SIYN 120 

EHGIQQIGVB VGRSPVFLFE DHTGKPAPED YPLFRTVNIA DGKWHRVAIS VEKKTVTMIV 180 

DCKKKTTKPL DRSERAIVDT NGITVFGTRI LDEEVFEGDI QQPLITGDPK AAYDYCEHYS 240 

PDCDSSAPKA AQAQEPQIDE YAPEDIIEYD YEYGEAEYXE AESVTEGPTV TEETIAQTEA 300 

NIVDDFQEYN YGTMESYQTE APRHVSGTNE PNPVEEIPTE EYLTGEDYDS QRKNSEDTLY 360 

ENKEIDGRDS DLLVDGDLGE YDFYEYKEYE DKPTSPPIJEE FGPGVPAETD ITETSINGHG 420 

AYGEKGQKGE PAWEPGMLV EGPPGPAGPA GIMGPPGLQG PTGPPGDPGD RGPPGRPGLP 480 

GADGLPGPPG TMLMLPFRYG GDGSKGPTIS AQEAQAQAIL QQARIAIiRGP PGPMGLTGRP 540 

GFVGGPGSSG AKGESCa^FGP QGPRGVQGPP GPTGKPGKRG RPGADOGRGM PGEPGKBGDR 600

GFDGLPGLPG DKGHRGERGP QGPPGPPGDD GMRGEDGEIG PRGIiPGEAGP RGLLGPRGTP 660 

GAPGQPGMAG VDGPPGPKGN MGPQGEPGPP GQQGNPGPQG I^POGPIGP PGEKGPQGKP 720 

GLAGLPGADG PPGHPGKEGQ SGEKGALGPP GPQGPIGXPG PRGVKGADGV RGLKGSKGEK 780 

GEDGFPGFKG OMGLKGDRGE VGQIGPRGXD GPEGPKGRAG PTGDPGPSGQ AGEKGKU3VP 840 

GLPGYPGRQG PKGSTGFPGF PGANGEKGAR GVAGKPGPRG QRGPTGPRGS RGARGPTGKP 900 

GPKGTSGGDG PPGPPGERGP QGPQGPVGPP GPKGPPGPPG RHGCP<2JPGQ RGETGFQGKT 960 

GPPGPGGWG PQGPTGETGP IGERGYPGPP GPPGBQGLPG AAGKBGAKGD PGPQGISGKD 1020 

GPAGLRGFPG ERGLPGAQGA PGLKGGEGPQ GPPGPVGSPG ERGSAGTAGP IGLRGRPGPQ 1080 

GPPGPAGEKG APGBKGPQGP AGRDGVQGPV GLPGPAGPAG SPGED<2)KGB I6EPGQKGSK 1140 

GGKGENGPPG PPGLQGPVGA PGIAGGDGEP GPRGQQGMFG QKGDEGARGF PGPPGPIGLQ 1200 

GLPGPPGEKG ENGDVGPWGP PGPPGPRGPQ GPNGADGPQG PPGSVGSVGG VGBKGEPGEA 1260 

GNPGPPGEAG VGGPKGERGE KGEAGPPGAA GPPGAKGPPG DDGPKGMPGP VGPPGDPGPP 1320 

GBLGFAGQDG VGGDK6EDGD PGQPGPPGPS GBAGPPGPPG K&GFPGAAGA EGRQGBKGAX 1380 

GBAGAEGPPG KTGPVGPQGP AGKPGPEGLR GIPGPVGEQ6 LPGAAGQDGP PGPMGPPGLP 1440 

GLKGDPGSKG EKGHPGLIGL IGPPGEQGEK GDRGLPGTQG SPGAKGDGGI PGPAGPLGPP 1500 

GPPGLPGPQG PKGNKGSTGP AGQKGDSGLP GPPGPPGPPG EVIQPLPILS SKKTRRHTEG 1560 

MQADADONIL DYSDGMEEIF GSLNSLKQDI EHMKFPMGTQ TNPARTCKDL QLSHPDFPDG 1620 

EYWIDPKQGC SGDSPKVYCM FTSGGETCIY POKKSEGVRI SSWPKEKPGS WFSEPKRGKL 1680 

LSYLDVEGNS INMVQMTFLK LLTASARQNF TYKCHQSAAW YDVSSGSYDK ALRFLGSNDE 1740 

EMSYDMNPFI KTLYDGCTSR KGYEKTVIEI NTPKIDQVPI VDVMISDFGD QNQKFGPEVG 1800 
PVCPJjG 



Seq ID NO: 362 DNA sequence
Nucleic Acid Accession #: NM_003107
Coding sequence: 351-1775 

1 11 21 31 41 51 

TTCCCCAGCA TTOGAGAAAC TCCTCTCTAC TTTAGCACGG TCTCX:aGACT CAGCOGAGAG 60
ACAGCAAACT GCaWSCGCGGT GAGAGAGGGA GAGAGAGGGA GAGAGA GACT CTCCAGCCTG 120 
GGAACTATAA CrOCTCTGCG AGAGGCGGAG AACTCCTTCC OCAAATCTTT TGGGGACTTT 180 
TCTCTCTTTA CCCACCTOOG OOCCTGOGAG GAGTTGA(3(36 GOCAGTTOGG COGCOGCGOG 240

CGTCTTCCOO TTOSGCGTGT GCTTGGCCOG GGGAACC3GGG AGGGCCOJGC GATCGOGOGG 300 

CGGCOGCOGC GAGGGTGTGA GOGCGOGTGG GOSCCOGCCG AGCOGftGGOC ATGGTGCAGC 360 

AAACCAACAA TGCCGAGAAC AGGGAAGOGC TGCTGGCOGG OGAGAGCTOG GACTOSGGGG 420 

COGGOCIOGA GCTGGGAATC GCCTCCTCCC CCAOGCCOGG CTCCAOOGCC TCCAOGGGCG 480 

GCAAGGCCJGA OGACCOGAGC TGGTGCAA6A CCCC3GAGTGG GCACATCAAG CGACCCATGA 540 

ACGCCTTCAT GGTGTGGTCG CAGATGGAGC GGOGCAAGAT CATGGAGCAG TCGCCOGACA 600 

TGCACAACGC OGAGATCTCC AAGaSGCTGG GCAAAOGCTG GAAGCTGCTC AAAGACAGOG 660 

ACAAGATCCC TTTCATTOGA GAGGOGGAGC GGCTGOGOCT CAAGCACATG GCTGACTACC 720 

(XGACTACAA GTACCGGCCC AGGAAGAAGG TGAAGTCCGG CAACGCCAAC TCCAGCTCCT 780 

OGGCCGCCGC CTCCTCCAAG COGGGGGAGA AGGGAGACAA GGTCGGTGGC AGTGGCGGGG 840 

GOGGCCATGG GGGCJGGCGGC GGOGGOGGGA GCAGCAACGC GGGGGGAGGA GGCGGGGGTG 900

OGAGTGGCGG CGGOGCCAAC TCCAAACOGG GGCAGAAAAA GAfiCTG COGC TCCAAAGTGG 960 

CGGGCGGCGC GGGCGGTGG6 GTTAGCAAAC GGCAQGOCAA GCTCATOCTG GCAGGOGGOG 1020 

GCGGCGGCGG GAAAGCAGOG GCrGCX:GCCG CCGCCTCCTT GGCCGCOGAA CAGGOGGGGG 1080 

CCGCCGCCCT GCTCCCCCTG GGCGCCGCCG CCGACCACCA CTOGCTGTAC AAGGOGGGGA 1140 

CTCCCAGCGC CTCGGCCTCC GCCTCCTCGG CAGCCTCGGC CTCCGCAGCG CTCGCGGCCC 1200 

CGGGCAAGCA CCTGGGGGftG AAGAAGGTGA AGCGCGTCTA CCTGTTaSGC GGCCTGGGCA 1260 

CGTCGTCGTC GCCCGTGGGC GGOGTGGGCX; OGGGAGCCGA CCCCAGOGAC CCCCTGGGCC 1320 

TGTACGAGGA GGAGGGCGOS GGCTCCTCGC COGACGCGCC CAGCCTGAGC GGCCGCAGCA 1380 

GCGCCGCCTC GTCCCCCGCC GCOGGCOGCT 0GGCOGCC6A CCAOOGOQGC TA06GCAGCC 1440 

TGCGCGCCGC CTCGCCCGCC CCGTCCAGCG CGCCCTCGCA CGCGTCCTCC TOGGCCTCGT 1500 

CCCACTCCTC CTCrrCCTCC TCCTCGGGCT CCTCGTCCTC CGAOGACGAG TTCGAAGACG 1560 

ACCTGCTCGA CCTGAACCXX: AGCTCAAACT TTGAGAGCAT GTCCXTrGGGC AGCTTCAGTT 1620 

CGTCGTCGGC GCTCGACCXK3 GACCTGGATT TTAACTTOGA GCCOGGCTCC GGCTCGCACT 1680 

TOGAGTTCCC GGACTACTGC AOGCCOGAGG TGAGCGAGAT GATCTCGGGA GACTGGCTOG 1740 

AGTCCAGCAT CTCCAACCTG GTTTTCACCT ACTGAAGGGC GCXSCAGGCAG GGAGAAGGfSC 1800 

CGGGGGGGGT AGGAGAGGAG'aAAAAAAAAG TGAAAAAAAG AAAOGAAAAG GACAGAOGAA 1860 

GAGTTTAAAG AGAAAAGGGA AAAAAGAAAG AAAAAGTAAG CAGGGCTOGT TCGCCCGCGT 1920 

TCTGGTCX5TC GGATCAAG6A GCGCGGCGGC GTTTTGGACC CGCGCTCCCA TCCCCCACCT 1980 

TCCGGGGOOG GGGACCCACT CTGCCCAGCC GGAGGGAOGC GGAGGAGGAA GAGGGTAGAC 2040 

AGGGGOGACC TGTGATTGTT GTTATTGATG TTGTTGTTGA TGGCAAAAAA AAAAAGOGAC 2100 

TTCGAGTTTG CTCCCCTTTG CTTGAAGAGA CCCCCTCCOC CTTCCAACX3A GCTTCCX3GAC 2160 

TTGTCTGCAC CCCCAGCAAG AAGGCGAGTT AGTTTTCTAG AGACTTGAAG GAGTCTCCOC 2220 

CrrcCTGCAT CACCACCTTG GTTTTGTTTT ATTTTGCTTC TTGGTCAAGA AAGGAGOGGA 2280 

GAACXCAGCG CACCXXTCCC CCCCTTTTTT TAAAOGCGTG ATGAAGACAG AAGGCTCCGG 2340 

GGTGACGAAT TTGGCCGATG GCAGATGTTT TGGGGGAACG CCGGGA CTGA GAGACTCCAC 2400 

GCAGGCGAAT TCCCGTTTGG GGCCTTTTTT TCCTCCCTCT TTTCCCCTTG CCCCCTCTGC 2460 

AGCOGGAGGA GGAGATGTTG AGGGGAGGAG GCCAGCCAGT GTGACOGGOG CTAGGAAATG 2520 

ACCCGAGAAC CCOSTTGGAA GOSCAGCAGC GGGAGCTAGG GGCGGGGGCG GAGGAGGACA 2580 

CGAACTGGAA GGGGGTTCAC GGTCAAACTG AAATGGATTT GCACGTTGGG GAGCTGGCGG 2640 

CGGCGGCTGC TGGGCCTCCG CCTTCTTTTC TACGTGAAAT CAGTGAGGTG AGACTTCCCA 2700 

GACCCX3GGAG GCGTGGAGGA GAGGAGACTG TTTGATGTGG TACAGGGGCA GTCAGTGGAG 2760 
GGCGAGTGGT TT06GAAAAA AAAAAAGAAA AAAA0G6 



Seq ID NO: 363 Protein sequence
Protein Accession #: NP_003098

1 11 21 31 41 51

| | | | | |

MVQQTNNAEN TEALLAGESS DSGAGLELGI ASSPTPGSTA STGGKADDPS WCKTPSGHIK 60 

RPKNAFHVHS QZERRRIMBQ SPDMHHAEIS KRLGKRWKLZi XDSOKZPFIR EAERLRLKBM 120 

ADYPDYKYRP RKKVKS6NAN SSSSAAASSR PGEKGDKVGG SG0GGH6G0G GGGSSNAGGG 180 

GGGASGGGAN SKPAQKKSCG SKVAGGACGG VSKPHAKLIL AGOGGGGKAA AAAAASFAAB 240 

QAGAAALLPL GAAADHHSLY KARTPSASAS ASSAASASAA LAAPGKHLAB KRVKRVYLPG 300 

GLGTSSSPVG GVGAGADPSD PLGLYEEEGA GCSPDAPSLS GRSSAASSPA AGRSPADHRG 360 

YASLRAASPA PSSAPSKASS SASSHSSSSS SSGSSSSDDB FEDOLUJLNP SSNFESMSU3 420 
SFSSSSAIiDR DU3FNFEP6S GSHPSFPDYC TPEVSEMIS6 DHLESSISNL VFTY 



Seq ID NO: 364 DNA sequence 
Nucleic Acid Accession #: U10860
Coding sequence: 123-2204

1 11 21 31 41 51

| | | | | |

TGCCGGCTGC TCCTCGACCA GGCCTCCTTC TCAACCTCAG CCCGOGGOGC CGACCCTTCC 60 

GGCACCCTCC CGCCCCGTCT CGTACTGTOG CCGTCACCX5C OGCGGCTCCG GCCCTGGCCC 120 

CGATGGCTCT GTGCAACGGA GACTCCAAGC TGGAGAATGC TGGAGGAGAC CTTAAGGATG 180 

GCCACCACCA CTATGAAGGA GCTGTTGTCA TTCTGGATGC TGGTGCTCAG TACGGGAAAG 240 

TCATAGACOG AAGAGTGAGG GAACTGTTOG TGCAGTCTGA AATTTTCCCC TTGGAAACAC 300 

CAGCATTTGC TATAAAGGAA CAAGGATTCC GTGCrATTAT CATCTCTGGA GGACCTAATT 360 

CTGTGTATGC TGAAGATGCT CCCTGGTTTG ATCCAGCAAT ATTCACTATT GGCAA6CCT6 420 

TTCTTGGAAT TTGCTATGGT ATGCAGAT6A TGAATAAGGT ATTTGGAGGT ACTGTGCACA 480 

AAAAAAGTGT CAGAGAAGAT GGAGTTTTCA ACATTAGTGT GGATAATACA TGTTCATTAT 540 

TCAGGGGCCT TCAGAAGGAA GAAGTTGTTT TGCTTACACA TGGAGATAGT GTAGACAAAG 600 

TAGCTGATGG ATTCAAGGTT GTGGCACGTT CTGGAAACAT AGTACCAGGC ATAGCAAATG 660 

AATCTAAAAA GTTATATGGA GCACAGTTCC ACOCTGAAGT TGGOCTTACA GAAAATGGAA 720 

AAGTAATACT GAAGAATTTC CTTTATGATA TAGCTGGATG CAGTOGAACC TTCACCGTGC 780 

AGAACAGAGA ACTTGAGTGT ATTCGAGAGA TCAAAGAGAG AGTAGGCACG TCAAAAgTTT 840 

TGGTTTTACT CAGTGGTGGA GTAGACTCAA CAGTTTGTAC AGCTTTGCTA AATOGTGCTT 900 

TGAACCAAGA ACAAGTCATT GCTGTGCACA TTGATAATGG CTTTATGAGA AAACGA6AAA 960 

GCCAGTCTGT TGAAGAGGCC CTCAAAAAGC TTGGAATTCA GGTCAAAGTG ATAAATGCTQ 1020 

CTCATTCTTT CTACAATGGA ACAACAACCC TACX3VATATC AGATGAAGAT AGAACCCCAC 1080 

GGAAAAGAAT TAGCAAAAOG TTAAATATGA CCACAAGTCC T6AACAGAAA AGAAAAATCA 1140 

TTGGGGATAC TTTTGTTAAG ATTGCCAATG AAGTAATTGG AGAAAIGAAC TTGA AAOCAG 1200 

A0GAG6TTTT CCTTGCCCAA GGTACTTTAC GGOCTGATCT AATT6AAAGT GCATCCCTTG 1260 

TTGCAAGTGG CAAAGCTGTUV CTCATCAAAA CCCATCACAA TGACACAGAG CTCATCACAA 1320

AGTTGAGAGA GGAGGQAAAA GTAATAGAAC CTCTGAAAGA TTTTCATAAA GATGAAGTGA 1380

GAA7TTTGG6 CAGAGAACTT GGACrrOCAG AAGAGTTAGT TTCCAG6CAT CCATTTCCAG 1440

GTCCTGGCCT GGCAATCA6A GTAATATGtG CTGAAGAACC tTATATTTGT AAGGACTTTC 1500

CTGAAACCAA CAATATTTTG AAAATAGTAG CTGATTTTTC TGCAAGTGTT AAAAAGCCAC 1560

ATACXCTATT ACAGAGAGTC AAAGCCTGCA CAACAGAAGA GGATCAOGAG AAGCTGATGC 1620

AAATTACCAG TCTGCATTCA CTGAATGCCT TCTTGCTGCC AATTAAAACT GTAGGTGTGC 1680

AGGGTGACTG TOGTTOCTAC AGTTACGTGT GTGGAATCTC CAGTAAAGAT GAACCTOUiT 1740

GGGAATCACT TATTTTTCTG GCTAGGCTTA TACCTCGCAT CTGTCACAAC GTTAACAGAG 1800

TTGTTTATAT ATTTGGCXXIA OCAGTTAAAG AAC5CTCCTAC AGATGTTACT CCCACTTTCT 1860

TGACAACAGG GGTGCTCAGT ACTTTAOGCC AAGCTGATTT TGAGGCOCAT AACATTCTCA 1920

GG6AGTCTGG GTATGCTGGG AAAATCftSCC AGATGOCGGT GATTTTGACA OCATTACATT 1980

TTGATOGGGA CCCACTTCAA AAGCAGCCTT CATGCCAGAG ATCTGTGGTT ATTCGAACCT 2040

TTATTACTAG TGACTTCATG ACTGGTATAC CTGCAACACC TGGCAATGAG ATCCCTGTAG 2100

AGGTGGTATT AAAGATGGTC ACTGAGATTA AGAAGATTCC TGGTATTTCT OGAATTATGT 2160

ATGACTIAAC ATCAAACCOC CCAGGAACTA CIGAGTGGGA GTAATAAACT TC



Seq ID NO: 365 Protein sequence
Protein Accession #: AAA60331

1 11 21 31 41 51
| | | | | |
KAIXa?GDSKL EtlAGGDLKDG HHHYEGAWI LDAGAQYGKV IDRRVRKLFV QSBIFPLETP 60
AFAIKEQGPR AIIISGGPNS VYAEDAPWFD PAZFTZGKPV liGICYGMOMM MRVFGGTVHX 120
KSVREDGVFN ISVDNTCSLP RGUQKBEWIj LTHGDSVDKV ADGFKWARS GHIVAGIAHE 180
SKKLYGAQPH PBVGLTENGK VILKNFLYDI AGCSGTFTVQ NRELECIREI KERVGTSKVL 240
VLLSGGVDST VCTALLNRAL MQEQVIAVHI DNGFKRKRES QSVEEALKKL GIQVKVINAA 300
HSPYNGTTTL PIS0E0R7PR KRISKTLNMT fSPEEKRKII GDTFVKIANE VIGEMMLKPE 360
GVPLAQGTLR PDLIESASLV AS6KABLIKT KHMDTELIRK LREEGBVIEP LKDFHKDSVR 420
ILGRELGLPE ELVSRHPPPG PGLAIRVICA EEPYICKDFP ETNNILKIVA DPSASVKKPH 480
TliLQRVKACT TEEDQEKLHQ ITSLHSLNAF LLPIKTVGVQ GDCRSYSYVC GISSKDEPDW 540
ESLIFLARLI PRMCHNVNRV VYIFGPPVKE PPTDVTPTFL TTGVLSTIiRQ ADPEAH2JILR 600
ESGYAGKISQ MPVILTPLHF DROPLQKQPS CQRSWIRTP ITSDFMTGIP ATPGNEIPVE 660
WLKMVTEIK KIPGISRIMY DLTSKPPGTT



Seq ID NO: 366 DNA sequence
Nucleic Acid Accession #: NM_004219
Coding sequence: 46-654

1 11 21 31 41 51
| | | | | |
GOGGCCTCAG ATGAATGOGG CTGTTAAGAC CTGCAATAAT CCAGAATG6C TACTCTGATC 60
TATGTTGATA AGGAAAATGG AGAACCAGGC ACCCGTGTGG TTGCTAAGGA TGGGCTGAAG 120
CTGGGGTCTG GACCTTCAAT CAAAGCCTTA GATGGGAGAT CTCAAGTTTC AACACCAOST 180
TTTGGCAAAA CGTTCGATGC CCCACCAGCC TTACCTAAAG CTACTAGAAA GGCTTTGGGA 240
ACTGTCAACA GAGCTACAGA AAAGTCTGTA AAGACCAAGG GACCCCTCAA ACAAAAACAG 300
CCAAGCTTTT CTGCCAAAAA GATGACTGAG AAGACTGTTA AAGCAAAAAG CTCTGTTCCT 360
GCCTCAGATG ATGCCTATCC AGAAATAGAA AAATTCTTTC CCTTCAATCC TCTAGACTTT 420
GAGAGTTTTG ACCTGCCTGA AGAGCACCAG ATT606CAOC TCOCCTTGAG TGGAGTGCCT 480
CTCATGATCC TTGACGAOGA GAGAGAGCTT GAAAAGCTGT TTCAGCTGGG CCCCCCTTCA 540
CCTGTGAAGA TGCCCTCTCC ACCATGGGAA TCCAATCTGT TGCAGTCTCC TTCAAGCATT 600
CTGTCGACCC TGGATGTTGA ATTGCCACCT GTTTGCTGTG ACATAGATAT TTAAATTTCT 660
TAGTGCTTCA GAGTTTGTGT GTATTTGTAT TAATAAAGCA TTCTTCAACA GAAAAAAAAA 720
AAAAAAAA

Seq ID NO: 367 Protein sequence
Protein Accession #: NP_004210

1 11 21 31 41 51
| | | | | |
MATLIYVDKE NGEPGTRWA KDGLKLGSGP SIKALOGRSQ VSTPRFGKTF DAPPALPKAT 60
RKALGTVNRA TEKSVKTRGP LKQKQPSFSA KKMTEKTVKA KSSVPASDDA YPEIEKPPPP 120
NPLDFESFDL PEEHQIAHLP LSGVPU4ILD EERELEKLFQ LGPPSPVKMP SPPWBSNIJiQ 180
SPSSILSTLD VELPPVCCDI DI



Seq ID NO: 368 DNA sequence
Nucleic Acid Accession #: NM_000597
Coding sequence: 118-1104

1 11 21 31 41 51
| | | | | |
ATTCGGGGOG AGGGAGGAGG AAGAAGCGGA GGAGGCGGCT CCCGCTCGCA GGGCCXjTGCA 60
CCT6CCCGCC 06CCCGCTCG CTCGCTOGCC GGCOGOGCCG CGCTGCC6AC CGCCAGCATG 120
CTGCGQAGAG TGGGCTGCCC OGOGCTGCOG CTGC0GCC6C OGOOGCTGCT GCCGCTGCTG 180
COGCTGCTGC TGCTGCTACT GGGCGCGAGT GG0GGC6GGG GGG6GG0G0G CGCGGAGGTG 240
CTGTTCCGCT GCCCGCCCTG CACACCOGAG 06CCTGGCCG CCTGCGGGCC CCCGCCGGTT 300
GCGCCGCCCG CCX3CGGTGGC OGCAGTGGCC GGAGGCGCCC GCATGCCATG CGCGGAGCTC 360
GTCCGGGAGC CGGGCTGOGG CTGCTGCTCG GTGT6CGCCC GGCTGGAGGG CGAG6GGTGC 420
GGOGTCTACA CCCCGCGCTG CGGCCAGG6G CTGCGCTGCT ATCCCCACCC GGGCTCCGAG 480
CTGCCCCTGC AGGCGCTGGT CATG6GCGAG GGCACTTGTG AGAAG06CCG GGACQGOGAG 540
TATGGCGCCA GCCCGGAGCA GGTTGCAGAC AATGGCGATG ACCACTCAGA AGGAGGOCTG 600
GTGGAGAACC AOGTGGACAG CACCATGAAC ATGTTGGGCG GGGGAGGCAG TGCTGGCCGG 660
AAGCCCCTCA AGTCGGGTAT GAAGGAGCTG GCCGTGTTCC GGGAGAAGGT CACTGAGCAG 720
CACCGGCAGA TGGGCAAGGG TGGCAAGCAT CACCTTGGCC TGGAGGAGCC CAAGAAGCTG 780
OGACCACCCC CT6CCAGGAC TCCCTGCCAA CAGGAACTGG ACCAGGTCCT 6GAGCG6ATC 840
TCCACCATGC GCCTTCCGGA TGAGCGGGGC CXrrCTGQAGC ACCTCTACTC CCTGCACATC 900
CCCAACTGTG ACAAGCATGG CCTGTACAAC CTCAAACAGT GCAAGATGTC TCTGAACGCSS 960
CAGOGTGGGG AGtGCTGGTG TGTGAACCCC AAGACCGG6A AGCTGATCCA GGGAGCCCCC 1020
ACCATCGQGG GQGACCCOGA GTCTCATCTC TTCIACAA3G AGCAGCAG6A GGCTTGCGGG 1080
GTCCACACCC AfSCGGATGCA GTAGACCGCA GCCAGCOGGT GCCTOGC3GCC CCTGCCCCCC 1140
GCXXCTCTCC AAACACCGGC AGAAAA.CGGA GAGTGCTTGG GTGCJreGCTG CTGGAGGATT 1200
TTCCAiGTTCT GACACACGTA TTTATATTTG GAAAGAGACC AGCACCGAGC TCGGCACCTC 1260
CCCGGOCTCT CTCTTCCCAG CTGCAGATGC CACACXTGCT CCTTCTTGCT TTCCCCGGGG 1320
GAGGAAGGGG GTTGTGGTOG GGGAGCTGGG GTACAGGTTT GGGGAGGGGG AAGAGAAATT 1380
TTTATTTTTG AACCCCTGTG TCCCTTTTGC ATAMGATTAA AQGAAOGAAA AGT



Seq ID NO: 369 Protein sequence
Protein Accession #: NP_000589

1 11 21 31 41 51

| | | | | |

MLPRVGCPAL PLPPPPLLPL LPLLLLLLGA SGGGGGASAE VZiFRCPPCTP ERLAACGPPP 60 
VAPPAAVAAV AGGASMPCAE LVREPGCGCC SVCARLEGEA OCSWTPROGQ GLRCYPEPGS 120 
ELPIiQAIjVKG EGTCEKRRDA EYGASPEQVA DIfGDDHSEGG LVBNHVDSTM NMI/3GGGSAG 180 
RKPIjKSGMKE LAVPRBKVTE QHRQXGKGGK HHLGLEEPKK LRPPPAHTPC QQEIiDQVXiER 240 
ISTMRLPDER GPLEHLYSLH IPNCDKHGLY NLKQCKMSUJ GQRGECWCVN PNTGKLIQGA 300 
PTIBGDPECH LFYMBQQEM: GVHTQRKQ 



Seq ID NO: 370 DNA sequence
Nucleic Acid Accession #: NM_004264
Coding sequence: 6-440

1 11 21 31 41 51

GQAACATGGC GGATOGGCTC ACGCAGCTTC AG6AGGCTGT GAATTOGCTT GCAGATCAGT 60

TTTGTAATGC CATTGGAGTA TTGCAGCAAT GTGGTCCTCC TGCCTCTTTC AAT AATATT C 120 

AGACAGCAAT TAACAAAGAC CAGCCAGCTA ACCCTACAGA AGAGTATGCC CAGCTTTPTG 180 

CAGCACT6AT TGCACGAACA GCAAAAGACA TTGATGTTTT GATAGATTCX: TTACCCAGTG 240 

AAGAATCTAC AGCTGCTTTA CAGGCTGCTA GCTTGTATAA GCTAGAAGAA GAAAACCATG 300 

AAGCTCCTAC ATGTGTGGAG GATGTTGTTT ATCGAGGAGA CATGCTTCTG GAGAAGATAC 360 

AAAGOGCACT TGCTGATATT GCACAGTCAC AGCIGAAQAC AAGAAGTGGT AC CCATRGC C 420 

AGTCTCTTCC AGACTCATAG CATCAGTGGA TACC31XGTCG CTGA0AAAA6 AACTGTTTGA 480 

GTGCCATTAA GAATtCTGCA TCAGACTTAG ATACAAGCCT TACCAACAAT TACAGAAACA 540 

TTAAACACTA TGACACATTA CCTTTTTAGC TATTTTTAAT AGTCTTCTAT TTTC ACTCTT 600 

GATAAGCTTA TAAATCATGA TTGAATCAGC TTTAAAGCAT CATACCATCA TTTTTTAACT 660 

GAGTGAAATT ATTAAGGCAT GTAATACATT AATGAACATA ATATAAGGAA ACATA TGTAA 720 

AATTCTGTTA TGACATAATT TATGTCTCCA TTTK5TTGTA TTGGCCAGTA CTTTTACAAT 780 
C 



Seq ID NO: 371 Protein sequence 
Protein Accession #: NP_004255



1 11 21 31 41 51 

MADRIiTQLQD AVNSLADQFC NAIGVLQQCG PPASFHMIQT AINKDQPASP TBEYAQLFAA 60 
LIARTAKDID VLIDSLPSEE STAALQAASL YKLEEENHEA ATCVEDWVR GDMLLBKIQS 120 

AliADIAQSQL KTRSGTH5QS LPOS 



Seq ID NO: 372 DNA sequence
Nucleic Acid Accession #: AJ271091
Coding sequence: 1-1113

1 11 21 31 41 51

| | | | | |

ATGGftGAATC AGGTGTTGAC GOCGCATGTC TACTGGGCTC AG0GACACXX3 CGAGCTATAT 60 

CTGCGCGTGG AGCTGAGTGA CGTACAGAAC CCTGCCATCA GCATCACTGA AAACGTGCTG 120 

CATTTCAAAG CTCAAGGACA TGGTGCCAAA GGAGACAATG TCTATGAATT TCACCTGGAG 180 

TTCTTAGACC TTGTGAAACC AGAGCCTGTT TACAAACTQA CCCAQAGGCA GGTAAACATT 240 

ACAGTACAGA AGAAAGTGAG TCAGTGGTGG GAGAGACTCA CAAAGCAGGA AAAGCGACCA 300 

CTGTTTTTGG CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGCGGA AATGGAGCTC 360 

AGAGCTAAGG AAGAAGAGCG CCTAAATAAA CTCOGACTGG AAAGCGAAGG CTCTCCTGAA 420 

ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGCA ATT CTTGGGA 480 

TTCTCCTGGA TCTTTGTCAA CCTGACTGTG OGATTCTGTA TCTTGGGAAA AGAGTOCTTT 540 

TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG 600 

GAAACTATCA ATGCAGCAAT TGGAGTCACT ACGTCACCGG TGCTGCCTTC TCTGATCCA6 660 

CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATGGAAGA AATGCAGAAC 720 

AAAGCTGTGG TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780 

TTCTACATGC TGACGTGCAT TGACATGGAT TGGAAGGTGC TCACATGGCT TCGTTACACT 840 

CTCTG6ATTC CCTTATATCC ACTGGGATGT TTGGOGGAAG CTGTCTCAGT GATTCAGTCC 900 

ATTCCAATAT TCAATGAGAC CGGACGATTC AGTTTCACAT TGCC ATATCC A GTGAA AATC 960 

AAAGTTAGAT TTTCCTTTTT TCTTCAGATT TATCTTATAA TOATATTTTT AGGTXTATAC 1020 

ATAAATTTTC GTCACCTTTA TAAACAGCGC AGftCIGAAAA TGAGGGCAGG CGCAGTGGCT 1080 
CATGCCTGTG ATCCCAGCGC TTTGGGAGGC TGA 

Seq ID NO: 373 Protein sequence
Protein Accession #: CAB69070

1 11 21 31 41 51
| | | | | |
MENQVLTPHV YHAQRHRBIiY LRVEL50VQN FAISITENVL RFKAQGHGAK GDMVYEFHLE 60
FLDLVKPEPV YKLTQRQVNI TVQK2CVSQHU ESLTKQEKRP LFLAPDFDRH LDESOAEMEL 120
RAKEEERLNK LRLESS6SPE TLTNLRKGYL FMYNLVQPLG PSWIFVNLTV RFCILGKESF 180
YDTFHTVMW MYFCQMIAW ETIHAAIGVT TSPVLPSUQ LLGRNPILPI IFGIUBEXCf 240
BWVPPWPH, HSAIBIFRYS PyMLTCIDMD HKWLTWLRYT LHIPLYPLQC tAEAVSVIOS 300
IPIRIBIGRF SFTLPyCVKI XVRFSFFLQ! YLINIFLGLT IHFBBbYXOR BUMBAGAVA 360
BkCOPSALGG






Seq ID NO: 374 DNA sequence
Nucleic Acid Accession #: NM_016395
Coding sequence: 1-1113

1 11 21 31 41 51
| | | | | |
ATGGAGAATC AGGTGTTGAC GCCGCATGTC TACTGGGCTC AGOGACACOG OGAGCTATAT 60
CTGOGCGTGG AGCTGAGTGA CGTACAGAAC OCTGCCATCA GCATCACTGA AAAOGTGCTG 120
CATTTCAAAG CrCAAGGACA TGGTGCCAAA GGAGACAATG TCTATGAATT TCACCTGGA6 180
TTCTTAGACC TTGTGAAACC AGAGCCTGTT TACAAACTGA COCAGAGGCA GGTAAACATT 240
ACAGTACAGA AGAAAGTGAG TCAGTGGTGG GAGACSVCTCA CAAAGCAGGA AAAGCGACCA 300
[illegible] CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGOGGA AATGGAGCTC 360
AGAGCTAAGG AAGAAGAGCG CCTAAATAAA CKXGACTGG [illegible] [illegible] 420
ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGCA ATTCTTGGGA 480
TTCTCCTGGA TCTTTGTCAA CCTGACTGT6 CGATTCTGTA TCTTGGGAAA AGAGTCCTTT 540
TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG 600
GAAACTATCA ATGCAGCAAT TGGAGTCACT ACGTCACCGG TGCTGCCTTC TCTGATCCAG 660
CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATGGAAGA AATGCAGAAC 720
AAAGCTGT6G TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780
TTCTACATGC TGAOGTGCAT TGACATGGAT TGGAAGGTGC TCACATGGCT TCGTTACACT 840
CTGTGGATTC CCTTATATCC ACTGGGATGT TTGGCGGAAG CTGTCTCAGT GATTCAGTCC 900
ATTCCAATAT TCAATGAGAC CGGACGATTC AGTTTCACAT TGCCATATCC AGTGAAAATC 960
AAAGTTAGAT TTTCCTTTTT TCTTCAGATT TATCTTATAA TGATATTTTT AGGTTTATAC 1020
ATAAATTTTC GTCACCTTTA TAAACAGOGC AGACTGAAAA TGAGGGCAGG CGCAiSTGGCT 1080
CATGOCTGTG ATCCCAGCGC TTTGGGAG6C TGA

Seq ID NO: 375 Protein sequence
Protein Accession #: NP_057479

1 11 21 31 41 51
| | | | | |
MEMQVLTPHV yWAQRHR&LY LRVELSDVQN PAISITENVL HFKAQGKGAK GDNVYEFHLE 60
FLDLVKFEPV YKLTQRQVNI TVQKKVSQWW ERIiTKQEKRP LFLAPDFDRW LDESDAEMSL 120
RAKEEERLNK LRliBSEGSPE TLTNIiRKGYL FMYNLVOFLG FSWIFVNLTV RFCILGKESF 180
YDTFHTVADM MYFCQMLAW ETINAAIGV7 TSPVLPSLIQ LIX3BNFILFI IFGTNEENQN 240
KAWFFVFYIi WSAIEIFRYS FYMliTCZDMD HKVLTHLRYT LHIFLYPLGC LVEAVSVIQS 300
IPIFNETtORF SFTLPYPVKI KVRFSFFLQI YLIMIFLGLY INFRHLYKQR RRBYGKKRKR 360
STKKKDLDGF LFV



Seq ID NO: 376 DNA sequence
Nucleic Acid Accession #: NM_005987
Coding sequence: 1-270

1 11 21 31 41 51
| | | | | |
ATGAATTCTC AGCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCAGCCTCA GCAGCAGCAG 60
GTGAAACAAC CTTGCCAGCC TCCACCCCAG GAACCATGCA TCCCCAAAAC CAAGGAGCCC 120
TGCCAACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAGTGC CTGAGCCCTG CCAGCCCAA6 180
ATTCCAGAGC CCTGCCAGCC CAAG6TGCCT GAGCCCTGCC CTTCAACGGT CACTCCAGCA 240
CCAGCCCAGC AGAAGACCAA GCAGAAGTAA

Seq ID NO: 377 Protein sequence
Protein Accession #: NP_005978

1 11 21 31 41 51
| | | | | |
MNSQQQKQPC TPPPQPQQQQ VKQPCQPPPQ EPCIPKTKEP CQPKVPEPCH PKVPSPOQPK 60
IPEPOQPKVP BPCPSTVTPA PAQQKTXQK



Seq ID NO: 378 DNA sequence
Nucleic Acid Accession #: NM_002105
Coding sequence: 74-505

1 11 21 31 41 51

| | | | | |

ACAGCAGTTA CACTGOGGCG GGG6TCTGTT CTAGTGTTTG AGCC6TC6TG CTTCACOGGT 60

CTACCTCGCT AGCATGTCGG GCG60GGCAA GACTGGGGGC AAGGCCCGCG CCAAGGCCAA 120

GTOGCGCTCG TCGCGCGCCG GCCrCCAGTT CCCAGTGGGC CGTGTACACC GGCTGCT60G 180

GAAGGGCCAC TACGCCGAGC GCGTTGGCGC CGGCGCGCCA GTGTAOCTGG OGGCAGTGCT 240

GGAGTACCTC ACCGCTGAGA TCCTGGAGCT GGCGGGCAAT GCGGCCCGOG ACAACAAGAA 300

GACX3CGAATC ATCCCCCGCC ACCTGCAGCT GGCCATCOGC AACGACGAGG AGCTCAACAA 360

GCTGCTGGGC GGCGTGAGGA TCGCCCAGGG AGGCGTCCTG CCCAACATCC AGGCCGTGCT 420

GCTGCCCAAG AAGACCAGCG CCAC06TGG6 GCCGAAGGCG CCCT0G66C6 GCAAGAAGGC 480

CACCCAGGCC TCCCAGGAGT ACTAAGAGGG CCCGCGCOGC GGC0GGC06C CCCAGCTCCC 540

CATGCCACCA CAAAGGCCCT TTTAAGGGCC ACCACCGCCC TCATGGAAAG AGCTGAGCCG 600

CTTCAGACTG CGGGGCAAGC GGGCCGCGGC TCCXTTCCCC TCCCCTCCCC TCGCCOGCCT 660

TCGCCGCCCG GCCTOGAGTC CCCGCCCGCC CCCGCTCCCG TCCOGCACCG CCTGCCG06T 720

CGGCCTCGGG CCTGOCCTGT ^GCCGTCCG CCCTCCGGTA GGGTTOGGGC CTTGCGQATG 780

CGGCTTGGGC GGTCrrCGGG GACCTCOGTG GG6CGC3AAGA CCOGACOCTG COGGGGGGAG 840

GCOGGOSGOG COGCACCTGC CCGCCTOGGC GTTCGTGACT CAGCOGCCCX ATCCOGfiCTC 900 

GCTAAGGGGC IGGGGGGAGG COGCAGOICC TTCTGGAAGA CTTCGCCrTC CGCTCTGAOG 960 

CMGG6C0GAG GTGGGCAGTC CAGGGCGAGA GCOGGOGGGC CXGAAGGXGA GTCAG GCOC T 1020 

OGGCAGCTGC AGCCGGGGTG TCTGCyTACCC CCCXX3GGGTC CrGCTTAGCC CAGGACTTTC 1080 

AGAOGGCCGC TGGCCGGGAG GCTrTGGTGG GACAGAOGCG ATaSCOGATT TOGGTCTQGC 1140 

GCCCCTTCre CGGCCGGGAC CCAGGCCTTT CACATCAGC7 CTCCCTCCAT CTTCATTCAT 1200 

AGGTCXGC3GC TOGGGCOGGG ACX5AAGCACT TGGTAACAGG CACATCTTCC TCCCGAGTGA 1260 

CTGCCTOCTA GGAGGACATT TAOGGGAOGG CAGAGGCCTG CAGTTTGGCT TCACGGCTGG 1320 

CTATGTOGAC AGCAACAGTC GTTTTGCGGA ACGCGACTGG CAGCCAGGCC TGTCXSGGCCC 1380 

CCGACGCCGC CCCATTTCCC TTCCAGCAAA CTCAACTOGG CAATCCAAGC ACCTAGATAC 1440 

CAGCACAAGT CGGTTAATCC CTGTCTCGAC TGAGCXTCOG TTGGCTTCIG AACTGGAATT 1500 

CTGCAGCTAA CCCTTCCACG ACTAGAACCT TAGGCATTGG GGAGTTTTAG ATGGACTAAT 1560 
TTTATTAAAG GATTGTTTTT TTTTT 

Seq ID NO: 379 Protein sequence
Protein Accession #: NP_002096

1 11 21 31 41 51
| | | | | |
MSGRGKTGGK ARAKAKSRSS RAGLQFPVGR VHRLLRKGHY AE»VGAGAPV YLAAVLBYLT 60
AEILBLAGNA ARDNXRTRIZ PRHLQLAIBN DEBUIKLLGG VTIAQGGVLP NIQAVLLPKX 120
TSATVGPKAP SGGKKATQAS QEY



Seq ID NO: 380 DNA sequence
Nucleic Acid Accession #: AL136942
Coding sequence: 184-864

1 11 21 31 41 51

| | | | | |

ACGCGTCCGG CAGAAGCTCG GAGCTCTCGG GGTATCGAGG AGGCAGGCCC GCGGGCGCAC 60 

GGGCGAGCGG GCCGGGAGCC GGAGOSGCGG AGGAGCCGGC AGCAGCGGCG CGGCGGGCTC 120 

CAGGCGAGGC GGTCGACGCT CCTGAAAACT TG0GCX3CGCG CTCGCGCCAC TGCGCCCGGA 180 

GCGATGAAGA TGGTCGCGCC CTGGAOGOQG TTCTACTCCA ACAGCTGCTG CTTGTGCTGC 240 

CATCTCCGCA CCG6CACCAT CCPGCTOGGC GTCTGGTATC TGATCATCAA TGCTGTGGTA 300 

CTCTTCATTT TATTGAGTGC CCTGGCTGAT CC3GGATCAGT ATAACTTTTC AAGTTCTGAA 360 

CTGGGAGGTG ACTTTGAGTT CATGGATGAT GCCAACATGT GCATTGCCAT T6GGATTTCT 420 

CTTCTCATGA TCCTGATATG TCCTATGGCT ACTTACGGAG CGTACAAGCA ACGCGCAGCC 480 

TGGATCATCC CATTCTTCTG TTACCAGATC TTTGACTTTG CCCTGAACAT GTTGGTTGCA 540 

ATCACTGTGC TTATTTATCC AAACTCCATT CAGGAATACA TACGGCAACT GCCTCCTAAT 600 

TTTCCCTACA GAGATGATGT CATGTCAGTG AATCCTACCT GTTTGGTCCT TATTATTCTT 660 

CTGTTTATTA GCATTATCTT GACTTTTAAG GGTTACTTGA TTA GCTGT GT TTGGAACTGC 720 

TACCX3ATACA TCAATGGTAG GAACTCCTCT GATGTCCTGG TTTATGTTAC CAGCAATGAC 780 

ACTACGGTGC TGCTACCXXC GTATGATGAT GCCACTGTGA ATGGTGCTGC CAAGGAGCCA 840 

CCGCCACCTT AOGTGTCTGC CTAAGCCTTC AAGTGGGCGG AGCTGAGGGC AGCAGCTTGA 900 

CTTTGCAGAC ATCTGAGCAA TAGTTCTGTT ATTTCACTTT TGCCATGAGC CTCTCTGAGC 960 

rmrriv r iG ctgaaatgct actttttaaa atttagatgt tagattgaaa actgtagttt 1020 

TCAACATATG CTTKSCTAGA ACACTGTGAT AGATTAACTG TAGAATTCTT CCTGTACGAT 1080 

TGGGGATATA ACGGGCTTCA CTAACCTTCC CTAGGCATTG AAACTTCCCC CAAATCTGAT 1140 

GGACCTAGAA GTCTGCTTTT GTACCTGCTG GGCCCCAAAG TTG GGCftTTT TTCTCTCTGT 1200 

TCCCrCTCTT TTGAAAATGT AAAATAAAAC CAAAAATAGA C3VACTTTTTC TTCAGCCATT 1260 

CCAGCATAGA GAACAAAACC TTATGGAAAC AGGAATGTCA ATTGTGTAAT CATTGTTCTA 1320 

ATTAGGTAAA TAGAAGTCCT TATGTATGTG TTACAAGAAT TTCCCCCACA ACATCCTTTA 1380 

TGACTGAAGT TCAATGACAG rrTGTGTTTG GTGGTAAAGG ATTTTCTCCA TGGC CTGA AT 1440 

TAAGACCATT AGAAAGCACC AGGC06T6GG AGCAGTGACC ATCTACTGAC TGTTCTTOTG 1500 

GATCTTGTGT CCAGGGACAT GGGGTGACAT GCCTCGTATG TGTTAGAGGG TGGAATGQAT 1560 

GTCTTTGGCG CTGCATGGGA TCTGGTGCCC CTCTTCTCCT GGATTCACAT CCCCACCCAG 1620 

GGCCCGCTTT TACTAAGTGT TCTGCCCTAG ATTGGTTCAA GGAGGTCATC CAACTGACTT 1680 

TATCAAGTGG AATTGGGATA TATTTGATAT ACTTCTGCCT AACAACATGG AAAAGGGTTT 1740 

TCTTTTOCXrr GCAAGCTACA TCCTACTGCT TTGAACTTCC AAGTATGTCT AGTCACCTTT 1800 

TAAAATGTAA ACATTTTCAG AAAAATGAGG ATTGCCTTCC TTGTATGCGC TTTTTACCTT 1860 

GACTACCTGA ATTGCAAGGG ATTTTTATAT ATTCATATGT TACAAAGTCA GCA ACTCTC C 1920 

TGTTGGTTCA TTATTGAATG TGCTGTAAAT TAAGTOGTTT GCAATTAAAA CAAGGTTTGC 1980 
CCACATCCAA AAAAAAAAAA AAAAA 

Seq ID NO: 381 Protein sequence 
Protein Accession #: GAB66876

1 11 21 31 41 51
| | | | | |

MKMVAPWTRF YSNSCCLCCH VRTGTILLGV WYLIINAWL IiIU^SALADP DQYNFSSSEL 60
GGDFEEMDDA NMCIAIAISU WILICAMAT YGAYKQRAAW IIPFPCYQIF DFALNMLVAI 120
TVLIYPMSIQ EYIRQLPPNP PYRDDVNSVN PTCLVLHU. PISIILTFKG YLISCVHNCY 180
RYINGRNSSD VLVYVTSNDT TVLLPPYDDA TVNGAAKEPP PPYVSA

Seq ID NO: 382 DNA sequence
Nucleic Acid Accession #: NM_002510
Coding sequence: 92-1774

1 11 21 31 41 51
| | | | | |

CAGATGCCAG AAGAACACTG TrGCTCTTGG TGGACGGGCC CAGAGGAATT CAGASTTAAA 60
CCTTGAGTGC CTGOGTCCGT GAGAATTCAG CATGGAATGT CTCTACTATT TCCTGGGATT 120
TCTGCTCCTG GCXGCAAGAT TGCCACTTGA TGCCGCCAAA CGATTTCATG ATGTGCTGGG 180
CAATGAAAGA CCTTCTGCTT ACATGAGGGA GCACAATCAA TTAAATGGCT GGTCTTCTGA 240
TGAAAATGAC TGGAATGAAA AACTCTACOC AGTGTGGAAG CXK^GGAGACA TGAGGTGGAA 300
AAACTCCTGG AACGGAGGCC GTGTGCAGGC GGTCCTGACC AGTGACTCAC CASCCCrOGT 360
OGGCTCAAAT ATAACATTTG CGGTGAACCT GATATTCCCT AGATGCCAAA AGGAAGATGC 420
CAATGGCAAC ATAGTCTATC AGAAGAACTG CAGAAA.TGAG GCrGCTTEAT CTGCTGATCC 480
ATATGTTEAC AACTGGACAG CATGGTCAGA OGACAGTGAC GGGGA AAftJG GCACGGGOCA 540
AAGCCATCAT AACGTCTTCC CTGATGGGAA ACCTTTTCCT CAOCAOCCXXj OATGGAGAAG 600
ATGGAATTTC ATCTACGTCT TCCACACACT TGGTCAGTAT TTCCAGAAAT TGGGAOGATG 660
TTCAGTGAGA GTTTCTGTGA ACACAGCCAA TGTGACACTT GGGCCTCAAC TCATGGAAGT 720
GACTGTCTAC AGAAGACATG GACGGGCATA TGTTCCCATC GCACAAGTGA AAGATGTGTA 780
CXSTGGXAACA GATCAGATTC CrG'IsyiTTGT GACTATGTTC CAGAAGAAOS ATCGAAATTC 840
ATCOGftOGAA ACCTTCCTCA AAGATCTCCC CATTATGTTT GATGTCCTGA TTCATGATCC 900
TAGCCACTTC CTCAATTATT CTACCATTAA CTACAAGTGG AGCTTOGGGG ATAATACTGG 960
CCTGTrrGTT TCCWXAATC ATACTGTGAA TCACACGTAT GTGCrCAATG GAACCTTCAG 1020
CCTTAACCTC ACTGTGAAAG CTGCAGCACC AGGACCTTGT COGCCACOGC CACCACCACC 1080
CAGACCTTCA AAACOCAOCX: CTTCTTTAGG ACCTGCTGGT GACAACCCXX TGGAGCTGAG 1140
TAGGATTCCT GATX5AAAACT GCCAGATTAA CAGATATGGC CACTTTCAAG CCACCATCAC 1200
AATTCTAGAG GGAATCTTAG AGGTTAACAT CATOCAGATG ACAGACGTCC TGATGCCCGT 1260
GCCATGGCCT GAAAGCTCCC TAATAGACTT TGTCGTGACC T6CCAAGGGA GCATTCCCAC 1320
GGAGGTCTGT AOCATCATTT CTGACCCCAC CTGOGAGATC ACCCAGAACA CAGTCTGCAG 1380
CCCTGTGGAT GTGGATGAGA TGTGTCTGCT GACTGTGAGA OGAACCTTCA ATGGGTCTGG 1440
GACGTACTGT GTGAACCTCA CCCTGGGGGA TGACACAAGC CTGGCTCrCA OGAGCACCCT 1500
GATTTCTGTT CCTGACAGAG ACCCAGCCTC GCCTTTAAGG ATGGCAAACA GTGCCCTGAT 1560
CTCCGTTGGC TGCTTGGCCA TATTTGTCAC TGTCATCTCC CTCTTGGTGT ACAAAAAACA 1620
CAAGGAATAC AAOCCAATAG AAAATAGTCC TGGGAATGTG GTCAGAAGCA AAGGCCTGAG 1680
TGTCTTTCTC AACOGTGCAA AAGCCGTGTT CTTCOOGGGA AACCAGGAAA AGGATCCGCT 1740
ACTCAAAAAC CAAGAATTTA AAGGAGTTTC TTAAATTTCG ACCTTGTTTC TGAACCTCAC 1800
TTTTCAGTGC CATTGATGTG AGATGTGCTG GAGTGGCTAT TAACCTTTTT TTCCTAAAGA 1860
TTATTGTTAA ATAGATATTG TCGTTTGGGG AAGTTGAATT TTTTATAGGT TAAATGTCAT 1920
TTTAGAGATG GGGAGAOGGA TTATACT6CA GGCAGCTTCA CCCATGTTGT GAAACTGATA 1980
AAAGCAACTT AGCAAGGCTT CTTTTCATTA TTTTTTATGT TTCACTTATA AAGTCTTAGG 2040
TAACTAGTAG GATAGAAACA CTGTGTCCCG AGAGTAAGGA GAGAAGCTAC TATTGATTAG 2100
AGCCTAACCC AGGTTAACTG CAAGAAGAGG CGGGATACTT TCAGCTTTCC ATGTAACTGT 2160
ATGCATAAAG CCAATGTAGT CXaVGTTrCTA AGATCATGTT CCAAGCTAAC TGAATCCXIAC 2220
TTCAATACAC ACTCATGAAC TCCTGATGGA ACAATAACAG GCCCAAGCCT GTGGTATGAT 2280
GTGCACACTT GCTAGACTCA GAAAAAATAC TACTCTCATA AATGGGTGGG AGTATTTTGG 2340
TGACAACCTA CTTTGCTTGG CTGAGTGAAG GAATGATATT CATATATTCA TTTATTCCAT 2400
GGACATTTAG TTAGTCCTTT TTATATACCA GGCATGATGC TGAGTGACAC TCTTGTGTAT 2460
ATTTCCAAAT TTTTGTATAG TCGCTGCACA TATTTGAAAT CATATATTAA GACTTTCGAA 2520
AGATGAGGTC CCTGGTTTTT CATGGCAACT TGATCAGTAA GGATTTCACC TCTGTTTGTA 2580
ACTAAAACCA TCTACTATAT GTTAGACATG ACATTCTTTT TCTCTCCTTC CTGAAAAATA 2640
AAGTGTGGGA AGAGACAAAA AAAAAAAAA

Seq ID NO: 383 Protein sequence
Protein Accession #: NP_002501

1 11 21 31 41 51
| | | | | |

MECLYYFLGF LLLAARLPLD AAKRFHDVLG NERPSAYMRE RNQLNGWSSD ENDWNEKLYP 60
VWKRGDMRWK NSWKGGRVQA VLTSDSPALV GSNITFAVNL IFPRCQKEDA NGNIVYEKNC 120
RNEAGLSADP YVYNWTAWSE OSDGENGTGQ SHKHVFPDGK PPPUHPGWRR WNFIYVFHTL 180
GQYFQKUSRC SVRVSVNTAN VTLGPQLMEV TVYRRHGHAY VPIAQVKDVY WTDQIPVFV 240
TMFQKNDRNS SDETFLKDLP IMFDVLIHDP SHFLNYSTIN YKWSFGDITTG LFVSTNHTVN 300
HTYVLNGTFS LNLTVKAAAP GPCPPPPPPP RPSKPTPSLG PAGDNPLELS RIPDENCQIN 360
RYC^FQATIT IVEGXLEVNZ IQ>frDVU4PV PWPESSLIDF WTCQGSIPT BVCTIISDPT 420
CEITC2NTVCS PVDVOEHCUi TVRRTFNGSG TYCVNLTLGD DTSLALTSTL ISVPDRDPAS 480
PLSHANSALI SVGCLAIPVT VISI.LVVKKH KEYHPIENSP GHWRSKGLS VFLNRAKAVF 540
FPOIQEKDPL LKNQEPKGVS

Seq ID NO: 384 DNA sequence
Nucleic Acid Accession #: NM_001134
Coding sequence: 48-1877

1 11 21 31 41 51
| | | | | |

TCCATATTGT GCTTCCACCA CTGCCAATAA CAAAATAACT AGCAACCATG AAGTGOGTGG 60
AATCAATTTT TTTAATTTTC CTACTAAATT TTACTGAATC CAGAACACTG CATAGAAATG 120
AATATGGAAT AGCTTCCATA TTGGATTCTT ACCAATGTAC TGCAGAGATA AGTTTAGCTG 180
ACCTGGCTAC CATATTTTTT GCCCAGTTTG TTCAAGAAGC CACTTACAAG GAAGTAAGCA 240
AAATGGTGAA AGATGCATTG ACTGCAATTG AGAAACCCAC TGGAGATGAA CAGTCTTCAG 300
GGTGTTTAGA AAACCAGCTA CCTGCCTTTC TGGAAGAACT TTGCCATGAG AAAGAAATTT 360
TGGAGAAGTA CGGACATTCA GACTGCTGCA GCCAAAGTGA AGAGGGAAGA CATAACTGTT 420
TTCTTGCACA CAAAAAGCCC ACTCCAGCAT OGATCCCACT TTTCCAAGTT CCAGAACCTG 480
TCACAAGCTG TGAAGCATAT GAAGAAGACA GGGAGACATT CATGAACAAA TTCATTTATG 540
AGATAGCAAG AAGGCATCCC TTCCTGTATG CACCTACAAT TCTTCTTTGG GCTGCTOGCT 600
ATGACAAAAT AATTCCATCT TGCTGCAAAG CTGAAAA3GC AGTTGAATGC TTCCAAACAA 660
AGGCAGCAAC AGTTACAAAA GAATTAAGAG AAAGCAGCTT GTTAAATCAA CATGCATGTG 720
CAGTAATGAA AAATTTTGGG ACCCGAACTT TCCAAGCCAT AACTGTTACT AAACTGAGTC 780
AGAAGTTTAC CAAAGTTAAT TTTACTGAAA TCCAGAAACT AGTCCTGGAT GTGGGOCATG 840
TACATGAGCA CTGTTGCAGA GGAGATGTGC TGGATTGTCT GCAGGATGGG GAAAAAATCA 900
TGTCCTACAT ATGTTCTCAA CAAGACACTC TGTCAAACAA AATAACAGAA TGCTGCAAAG 960
TGACCACGCT GGAAC6TGGT CAATGTATAA TTCATGCAGA AAATGATGAA AAACCTGAAG 1020
GTCTATCTCC AAATCTAAAC AGGTTTTTAG GAGATAGAGA TTTTAACCAA TTTTCTTCAG 1080
GGGAAAAAAA TATCTTCTTG GCAAGTTTTG TTCATGAATA TTCAAGAAGA CATCCTCAGC 1140
TTGCTGTCTC AGTAATTCTA AGAGTTGCTA AAGGATACX:A GGAGTTATTG GAGAAGTGTT 1200
TCCAGACTGA AAACCCTCTT GAATGCCAAG ATAAAGGAGA AGAAGAATTA CAGAAATACA 1260
TCCAGGAGAG CCAAGCATTG GCAAAGCGAA GCTGCGGCCT CTTCCAGAAA CTAGGAGAAT 1320
ATTACTTACA AAATGCGTTT CTCGTTGCTT ACACAAAGAA AGCCCCCCAG CTGACCTCGT 1380
CGGAGCTGAT GGGCATCACC AGAAAAATGG CAGCCACAGC AGCCACTTGT TGCCAACTCA 1440
GTGAGGACAA ACTATTGGCX: TGTGGCGACG GAGOGGCTGA CATTATTATC GGACACTTAT 1500
GTATGAGACA TGAAATGACT CCAGTAAAOC C1U a xat X3 G CCAGTGCTGC ACTTCTTCftT 1560

ATGOCAACAG GAGGCCATGC TTCAGCACCT TGGTGGTGGA TGAAACATAT GTCOCTCCPG 1620 

CATTCTCTGA TGACAAGTTC ATTTTCCATA AGGATCTGTG CCAAGCTCAG GGTGTAGOGC 1680 

TGCAAAOGAT GAAGCAAGAG TTTCTCATTA ACCTTGTGAA GCAAAAGOCA CAAATAACAG 1740 

AGGAACAACT TCAGGCTGTC ATTGCAGATT TCTCACGOCT GTTGGAGAAA TGCTGCCAAG 1800 

GGCAG6AACA GQAAGTCTGC TTEGCTGAAG AGGGACAAAA ACTGATTTCA AAAACTCtSTG 1860 

Clt iC mUGG AGTTTAAATT ACTTCAGGGG AAGA6AA6AC AAAAOGAGTC TTTCATTGGG 1920 

TQTGAACTTT TCTCTTTAAT TTTAACTGAT TTAACACTTT 3TGTGAATTA ATGAAATGAT 1980 
AAAGACTTTT ATGTGAGATT TCCTTATCAC AGAAATAAAA TATCTCCAAA TG 

Seq ID NO: 385 Protein sequence
Protein Accession #: NP_001125

1 11 21 31 41 51

| | | | | |

MKWVESIFLI PLLNFTESRT LHRNEYGIAS IU3SYQCTAE ISLADLATIF FAQFVQEATY 60 

KEVSKKVKDA LTAIBKPTGD' EQSSGCLBIQ LPAFLEELCH EKEILEKYCT SDCCSQSEEG 120 

RHNCFLAHIOC PTPASIPIiFQ VPEPVTSCEA YEEDRETFMN KFIYBIARSU PFLYAPTILI* 180 

WAARYDKIIP SCCKAEMAVE CFXJTKAATVT KELRESSUiIf QHACAVNXNF GTOTFIQAITV 240 

TKLSQKFTKV NFTEIQKLVL DVAHVHEHCC RGDVLDCLQD 6EKXKSYICS QQDTLSNKIT 300 

ECCKLTTLER GQCIIHAEHD EKPEGLSPNL NRFIiGDRDFN QFSSGEKNIF LASFVHEYSR 360 

RHPQIiAVSVI LRVAKGYQEL LEKCF0TE2IP LECQDKGEEE LQKYIQESQA LAKRSCGIiFQ 420 

KLGEYYLQNA FLVAYTKKAP QLTSSELMAI TRKMAATAAT CCQLSEDKLL AG6EGAADII 480 

IGHXiCIRKEM TPVNPGVGQC CTSSYANRRP CFSSLWDET YVPPAFSDDK PIFKKDLCQA 540 

QGVALOTMKQ EPLINLVKQK PQITEEQLEA VIADPSGLLE KCCQGQEQEV CFASBOQKLI 600 
SKTRAALGV 

Seq ID NO: 386 DNA sequence

Nucleic Acid Accession #: NM_002205.1

Coding sequence: 1..3149

1 11 21 31 41 51

| | | | | |

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGOTGCGCTG GGGCCCCCGG 60 

CGCCGACCCC CGCTSSTGCC GCTGCTGTrG CTGCTSSTGC CGCCGCCACC CAGGGTCGGG 120 

GGCTTCAACT TAGACGCX36A GGCCCCAGCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180 

GGATTCTCAG' TGGAGTTTTA OOGGCOGGGA ACAGACGGGG TCAGTGTGCT GGTGGGAGCA 240 

CCCAAGGCTA ATACCAGCCA GCCAQGAGTG CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300 

TGGGGTGCCA GCCCCACACA GTGCAOCCCC ATTGAATTTG ACAGCAAAGG CTCTOGGCTC 360 

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCA6 420 

TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCATGOGC TCCACTGTAC 480 

AGCTGGCGCA CAGAGAAGGA GCCACTGAGC GACCCCX3TGG GCACCTGCTA CCTCTCCACA 540 

GATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGATTTCAG CTGGGCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTGTGGTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGGCAC TCAGGAGCAG 720 

ATTGCAGAAT CTTATTACCC CX3AGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCCAGGCCA GTTCCATCTA TGATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840 

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900 

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGGGGAA 960 

CAGATGGCCT CCrACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGAOGG GCGGCCTCAG 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCC6 GCATAGA6CC CACGCCCACC 1140 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC GGATTTGGCA GCTCCTT6AC GCCCCTOGGG 1200 

GACCTGGACC AGGATGGCTA CAATGAT6TG GCCATCGGGG CTCCCTTTGG TGGGGAGACC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCflG ACTTCTTTGG CTCTGCCCTT 1380 

CGAGGAGGCC GAGACCTGGA TGGCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440 

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATOGTGT CX:GpTAGTGC CTCCCTCACC 1500 

ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGACTCCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680 

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACXX: TGCTCATCCA GAATGGGGCT 1740 

CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TOGAGACAAA 1800 

CPCTCGCCGA TTCACATOGC TCTCAACTTC TCCTTGGACX: CCCAAGCXXX: AGTGGACAGC 1860 

CACGGCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAQAGGA CAAGGC TCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG ACCTGCaGCT 6GAAGTGTTT 1980 

GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGOCC TGAAOCTCAC TTTCCATGCC 2040 

CAGAATGTGG GTGAGGGTGG CGCCTATGAG GCTGAGCTTC GGGTCACCGC CCCTCCAGAG 2100 

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160 

TTTGCCGTGA ACCAGAGCCG OCTGCTGGTG TOTGACXTTCG GCAAOCCCAT GAAGGCAGGA 2220 

GCCAGTCTGT GGGGTGGOCT TOSGTTTACA GTCCCTCATC TCGGG6ACAC TAAGAAAACC 2280 

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAG CGACGTGGTT 2340 

TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 

GAGGCAGTGC TATTCCCAGT AAGCX3ACTGG CATCCCCGAG ACCAGCCTCA GAAGGAGGAG 2460 

GACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580 

GTGACCAGAG TTACGGGACT CAACTGCACC ACCAATCACC CCATTAACCC AAAGGGCCTG 2640 

GAGTTGGATC COGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCX: AAGCCGCAGC 2700 

TCTGCTTCCT CX»GGACCTCA GATCCTGAAA TGCCOGGAGG CTGAGTGTTT CAGGCTGGGC 2760 

TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCAGTTGCA TTTCOGAGTC 2820 

TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCATTTA GCCTGCAGTG TGAGGCTGTG 2880 

TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940 

CAGGTGGCCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGOTr CCCACTGTGG 3000 

ATCATCATCC TAGCCATCCT GrTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATG6CA COGCCATGGA AAAAGCTCAG 3120 
CTCAAGCCTC CAGCCACCTC T6ATGCCT6A 



Seq ID NO: 387 Protein sequence
Protein Accession #: NP_002196.1

1 11 21 31 41 51
| | | | | |

KGSRTPSSPL HAVQLRWGP3 RRPPLXiPXiLL LLLPPPPSVG GPtlLOASAPA VLSGPPGSFF 60
GFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP KGASPTQCTP lEFDSKGSRL 120
LESSLSSSEG EBPVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKEPI.S DPVGTCYLST 180
DNFTRILEYA PCRSOPSHAA GQGYCQGGFS AEFTKTGHW LGGPGSYFWQ GQZLSATQBQ 240
lAESYYPBYL 'ZNliVQGQLQT ROASSIYDDS YLGYSVAVGB FSGOSTEDFV AGVPXGKLTY 300
GYVTILNGSD IRSIiYNPSGB QMASYFGYAV AATDVNGDGXi DDLLVGAPLL KDRTPDGRPQ 360
EVGHVYVYLQ HPAGIEPTPT LTLTGHDEFG RFG5SLTPLG DLDQDGY»DV AIGAPFGGET 420
QQGWFVFPG GPGGLGSKPS QVLQPLHAAS HTPDFFGSAL RGGROLDGMG YPOLIVGSFG 480
VDKAWYRGR PIVSASASLT IFPAMFNPEE RSCSLEGSPV ACINLSFCLN ASGKHVADSI 540
GFTVELQIiDW QKOKGGVRRA LPLASRQATL TQTLLIONGA REDCREMKIY LRNESEFRDK 600
L5PIHIALNF SLDPQAFVDS HGLRPALHYQ SKSRIEDKAQ ILLDC6EDNI CVPDLQLEVF 660
GBQJJHVYLCT KNAUII.TFHA QNVGEGGAYB AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720
FAVNQSRLLV CDLOTPMKAG ASLWGGLRFT VPHLRDTXKT I0FDFQXL5K NLNNSQSDW 780
SFRLSVEAQA QVTLKGVSKP EAVLFPVSDW UPRDQPQKEE DLGPAVKHVy ELZNQGPSSI 840
SQGVLSLSCP QALEGQQUjY VTRVTGUTCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900
SASSGPQILK CPBAECFRLH CELGPLKQQE SQSliQl^FRV WAKTFLQREH QPFSLQCEAV 960
YKALKMPYSI LPRQLPQKER QVATAVQWTK AEG5YGVPLW XIILAII.FGL lllglLiyzl 1020
YKLGFFRRSL PYGTAMEKAQ LKPPATSOA

Seq ID NO: 388 DNA sequence
Nucleic Acid Accession #: NM_002425
Coding sequence: 26..1453

1 11 21 31 41 51
| | | | | |

AAAGAAGGTA AGGGCAGTGA GAATGATGCA TCTTGCATTC CTTGTGCTGT TGTGTCTGCC 60
AGTCTGCTCT GCCTATCCTC TGAGTGGGGC AGCAAAAGA6 GAGGACTCCA ACAAGGATCT 120
TGCCCAGCAA TACCTAGAAA AGTACTACAA CCTCGAAAAG GA7GTGAAAC AGTTTAGAAG 180
AAAGGACAGT AATCTCATTG TTAAAAAAAT CCAAGGAATG CAGAAGTTCC TTGGGTTGGA 240
GGTGACAGGG AAGCTAGACA CTGACACrCT GGAGGTGATG CGCAAGCCCA GGTGTGGAGT 300
TCCTGACGTT GGTCACTTCA GCTCCTTTCC TGGCATGCC6 AAGTGGAG6A AAAOCX:AOCT 360
TACATACAGG ATTGTGAATT ATACACCAGA TTTGCCAAGA GATGCTGTTG ATTCTGOCAT 420
TGAGAAAGCT CTGAAAGTCT GGGAAGAGGT GACTCCACTC ACATTCTCCA GGCTGTATGA 480
AGGAGAGGCT GATATAATGA TCTCTTTOGC AGTTAAAGAA CATGGAGACT TTTACTCTTT 540
TGATG6CCCA GGACACAGTT TGGCTCATGC CTACCCACCT GGACCTGGGC TTTATGGAGA 600
TATTCACTTT GATGATGATG AAAAATGGAC AGAAGATGCA TCAGGCACCA ATTTATTCSTT 660
CGTTGCTGCT CATGAACTTG GCCACTCCCT GGGGCTCTTT CACrCAGCCA ACACTGAAGC 720
TTTGATGTAC CCACTCTACA ACTCATTCAC AGAGCTOGCC CAGTTCCGCC TTTCGCAAGA 780
TGATGTGAAT GGCATTCAGT CTCTCTAOGG ACCTCOCOCT GCCTCTACTG AGGAACOCCT 840
GGTGCCCACA AAATCTGTTC CTTOGGGATC TGAGATGCCA GCCAAGTGTG ATCCTGCTTT 900
GTCCTTCGAT GCCATCAGCA CTCTGAGGGG AGAATATCTG TTCTTTAAAG ACAGATATTT 960
TTGGCGAAGA TCXX:ACTGGA ACCCTGAACC TGAATTTCAT TTGATTTCTG CATTTTGGCC 1020
CTCTCTTCCA TCATATTTGG ATGCTGCATA TGAAGTTAAC AGCAGGGACA CCGTTTTTAT 1080
TTTTAAAGGA AATGAGTTCT GGGCCATCAG AGGAAATGAG GTAGAAGCAG GTTATCCAAG 1140
AGGCATCCAT AjCCCTGGGTT TTCCTCCAAC CATAA66AAA ATTGATGCAG CTGTTTCTGA 1200
CAAGGAAAAG AAGAAAACAT ACTTCTTTGC AGCGGACAAA TACTGGAGAT TTGATGAAAA 1260
TAGCXIAGTCC ATGGAGCAAG GCTTCCCTAG ACTAATAGCT GATGACTTTC CAGGAGTTGA 1320
GCCTAAGGTT GATGCTGTAT TACAGGCATT TGGATTTTTC TACTTCTTCA GTGGATCATC 1380
ACAGTTTGAG TTTGACCCCA ATGCCAGGAT GGTGACACAC ATATTAAAGA GTAACAGCTG 1440
GTTACATTGC TAGGCGAGAT AGGGGGAAGA CAGATATGGG TGTTTTTAAT AAATCTAATA 1500
ATTATTCATC TAATGTATTA TGAGCCAAAA TGGTTAATTT TTCCTGCATG TTCTGTGACT 1560
GAAGAAGATG AGGCTTGCAG ATATCTGCAT GTGTCATGAA QAATGTTTCT GGAATTCTTC 1620
ACTTGCTTTT GAATTGCMrr GAACAGAATT AAGAAATACT CAltjTGCAAT AGGTGAGAGA 1680
ATGTATTTTC ATAGATGTGT TATTACTTCC TCAATAAAAA GTTTTATTTT GGGCCTGTTC 1740
CTT

Seq ID NO: 389 Protein sequence
Protein Accession #: NP_002416

1 11 21 31 41 51
| | | | | |

MHLAFLVLLC LPVCSAYPLS 6AAKEEDSKK DLAOQYLEKY VKLEKDVKQF RRKDSNtilVK 60
KIQGMQKFLG LEVTGKLDTD TLEVMRKPRC GVPDVGHFSS FPGMPKWRIcr HLTYRIVNYT 120
PDLPRDAVDS AIEKALKVWE EVTPLTFSRL YEGEADIMIS FAVXEHGDFY SFDGFGHSLA 180
HAYPPGPGLY GDIHFDEDEK WTEDASGTNL FLVAAHEL6H SLGLFHSAMT EALKYPIiYNS 240
FTELAQFRLS QDDVNGIQSL YGPPPASTE6 PLVPTKSVPS GSEMPAKCDP ALSFDAISTL 300
RGEYLFFKDR YFWRRSHWNP EPEFHLISAF WPSLPSYLOA AYEVNSRDTV PIPKGNEFWA 360
IRGNEVQAGY PRGIHTLGFP PTIRKIDAAV SDKEKKKTYF FAADKYWRFD EMSQSMBQGF 420
PRLIADDFPG VEPKVDAVLQ AFGPFYFFSG SSQFBFDPNA RMVTHILKSN SWLHC

Seq ID NO: 390 DNA sequence
Nucleic Acid Accession #: NM_002421.2
Coding sequence: 1..1409

1 11 21 31 41 51
| | | | | |

ATGCACAGCT TTCCTCCACT GCTGCTGCTG CTGTTCTGGG GTGTG6TGTC ACACAGCTTC 60
CCAGOGACTC TAGAAACACA AGAGCAAGAT GTGGACTTAG TCCAGAAATA CCTGGAAAAA 120
TACTACAACC TGAAGAATGA TGGGAGGCAA GTTGAAAAGC GGAOAAATAO TOGCOCAGTG 180
GTTGAAAAAT TGAAGCAAAT GCAGGAATTC I 'TT GGG CTGA AAGTGACTGG GAAACCAGAT 240
GCTGAAACCC TGAAGGTGAT GAAGCAGCCC AGATGTGGAG TGCCTGATGT GGCTCAGTTT 300
GTCCrCACTG AGGGGAACCC TCGCTGGGAG CAAACACATC TGACCTACAG GATTGAAAAT 360
TACAOGCCAG ATTTGCCAAG AGCAGATGTG GACCATGCX3V TTGAGAAAGC CTTCCAACTC 420
TGGAGTAAT6 TCACAOCTCT GACATTCftOC AAGGTCTCTG AGGGTCAAGC AGACATCATG 480
ATATCTTTT6 TCAGQGGACA TOVTOGGGAC AACTCTCCTT TTGATGGACC TGGAGGAAAT 540
CTTGCTCATG CTTTTCAACC AGGCCCAGGT ATTGGAGGGG ATGCTCATTT TGATGAAGAT 600
GAAACGTGGA CCAACAATTT CAGAGAGTAC AACTTACATC GTGTTGOGGC TCATGAACTC 660
GGCCATTCTC TTGGACTCTC CCATTCTACT GATATOGGGG CTTTGATGTA CCCTAGCTAC 720
ACCrrCftGTG GTGATGTTCA GCTAGCTCAG GATC3ACATTG ATGGCATCCA AGCCATATAT 780
GGAOGTTCCX; AAAATCCTGT CCAGCCCATC GGCCCACAAA CCCCAAAAGC ATGTGACAGT 840
AAGCTAACCr TTGATGCTAT AACTAOGATT OGGGGAGAAG TGATGTTCTT TAAAGACAGA 900
TTCTACATCC GCACAAATCC CTTCTACCOG GAAGTIGftGC TCAATTTCAT TTCTGTTTTC 960
TGGCCACAAC TGCCAAATCG GCTTGAAGCT GCTTAOGAAT TTGCCGACAG AGATGAAGTC 1020
CGGTrrrrcA AAGGGAATAA GTACTGGGCT GTTCAGGGAC AGAATGTGCT ACACGGATAC 1080
CCCAAGGACA TCTACAGCTC CTTTGGCTTC CCTAGAACTG TGAAGCATAT CGATGCTGCT 1140
CTTTCTGAGG AAAACACTGG AAAAACCTAC TTCrTTGTTG CTAAOUVATA CTGGAGGTAT 1200
6ATCAATATA AAOGATCTAT GGATCCftGGT TATCOCAAAA TGATAGCACA TGACTTTCCT 1260
GGAATTGGOC ACAAAGTTGA TGCAGTTTTC ATGAAAGATG GATTTTTCTA TTTCTTTCAT 1320
QGAACSttfiAC AATACAAATT TGATCCTAAA AOGAAGAGAA TTTTGACTCT CCAGAAAGCT 1380
AATAGCTGGT TCAACTGCAG GAAAAATTAG

Seq ID NO: 391 Protein sequence
Protein Accession #: NP_002412.1

1 11 21 31 41 51
| | | | | |

MMSPPPLLUi LFWGWSHSF PATLETQEQD VDLVQKYLEK YYNLKNDGRQ VEKRRNSGPV 60
VEKLKQKQEF FGLKV/TGKPD AETIiKVMKQP RCGVPDVAQF VLTEGNPRWE QTHLTYRIEN 120
YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSBX3QADIM ISFVRGDHRO NSPPDGPGGN 180
XiAHAFQPGFG IGGDAHFDED ERWTNNFREY HLHRVAAHEL GHSLGLSHST DIGALMYPSY 240
TFSGDVQLAQ DOIDGIQAIY GRSQNPVQPI 6PQTPKACDS KLTFDAITTI RGEVMPFKDR 300
FYMRTNPPYP EVBLNPISVF WPQLPNGI£A AYEFADROSV RFFKGtnCYHA VQGQNVLHGY 360
PKDIYSSFGP PRTVKHIDAA LSEENTGKTY FPVANKYWRY DEYKRSMDPG YPWaAHDPP 420
GIGHKVDAVP MKDGFFYFFK GTRQYKFDPK TKRILTLQKA NSWFNCRKN

Seq ID NO: 392 DNA sequence
Nucleic Acid Accession #: NM_002421.2
Coding sequence: 1..1409

1 11 21 31 41 51
| | | | | |

AraCACAGCT TTCCTCCACT GCTGCTGCTG CTGTTCTGGG GTGTGGTGTC ACACAGCTTC 60
GCAGCGACTC TAGAAACACA AGAGCAA6AT GT06ACTTAG TCCAGAAATA CCTGGAAAAA 120
TACTACAACC TGAAGAATGA TGGGAGGCAA GTTGAAAAGC GGAGAAATAG TGGCCCAGTG 180
GTTGAAAAAT TGAAGCAAAT GCAGGAATTC TTTGGGCTGA AAGTGACTGG GAAACCAGAT 240
GCTGAAACCC TGAAGGTGAT GAAGCAGCCC AGATGTGGAG TGCCTGATGT G6CTCAGTTT 300
GTCCTCACTG AGGGGAACCC TCGCTGGGAG CAAACACATC TGACCTACAG GATTGAAAAT 360
TACACGCCAG ATTTGCCAAG AGCAGATGTG GACCATGCCA TTGAGAAAGC CTTCCAACTC 420
TGGAGTAATG TCACACCTCT GACATTCACC AAGGTCTCTG AGGGTCAAGC AGACATCATG 480
ATATCTTTTG TCAGGGGAGA TCATOGGGAC AACTCTCCTT TTGATGGACC TGGAGGAAAT 540
CTTGCTCATG CTTTTCAACC AGGCCCAGGT ATTGGAGGGG ATGCTCATTT TGATGAAGAT 600
GAAAGGTGGA CCAACAATTT CAGAGAGTAC AACTTACATC GTGTTGCGGC TCATGOCCTC 660
GGCCATTCTC TTGGACTCTC CCATTCTACT GATATOGGGG CTTTGATGTA CCCTAGCTAC 720
ACCTTCAGTG GTGATGTTCA GCTAGCTCAG GATGACATTG ATGGCATCCA AGCCATATAT 780
GGACGTTCCC AAAATCCTGT CCAGCCCATC GGCCCACAAA CCCCAAAAGC ATGTGACAGT 840
AAGCTAACCT TTGATGCTAT AACTACGATT CGGGGAGAAG TGATGTTCTT TAAAGACAGA 900
TTCTACATGC GCACAAATCC CTTCTACCCG GAA6TTGAGC TCAATTTCAT TTCTGTTTTC 960
TGGCCACAAC TGCCAAATGG GCTTGAAGCT GCTTAOGAAT TTGCCGACAG AGATGAAGTC 1020
CGGTTTTTCA AAGGGAATAA GTACTGGGCT GTTCAGGGAC AGAATGTGCT ACACGGATAC 1080
CCCAAGGACA TCTACAGCTC CTTTGGCTTC CCTAGAACTG TGAAGCATAT CGATGCTGCT 1140
CTTTCTGAGG AAAACACTGG AAAAACCTAC TTCTTTGTTG CTAACAAATA CTGGAGGTAT 1200
GATGAATATA AACXyVTCTAT GGATCCAGGT TATCCCAAAA TGATAGCACA TGACTTTCCT 1260
GGAATTGGCC ACAAAGTTGA TGCAGTTTTC ATGAAAGATG GATTTTTCTA TTTCTTTCAT 1320
GGAACAAGAC AATACAAATT TGATCCTAAA AOGAAGAGAA TTTTGACTCT CCAGAAAGCT 1380
AATAGCTGGT TCAACTGCAG GAAAAATTAG

Seq ID NO: 393 Protein sequence
Protein Accession #: NP_002412.1

1 11 21 31 41 51
| | | | | |

MHSFPPLLLL LFWGWSHSF PATLETQE3QD VDLVQKYLEK YYNLKNDGRQ VEKRRNSGPV 60
VEiCLKQMQEP PGLKVTGKPD AETLKVMKQP RCGVPDVAQF VLTEGNPRWB QTHLTYRIEN 120
YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPPDGPGGN 180
LAHAFQPGPG IGGOAHFDEO ERHTNNFREY NLHRVAAHAL 6HSLGLSHST DIGALMYPSY 240
TFSGDVQLAQ DDIOGXQAIY GRSQNPVQPI GPQTPKACD5 KIiTFDAZTTX RGEVMPFKDR 300
FYMRTNPPYP EVELNFI5VF WPQLPNGLEA AYEFADRDEV RFFKGNKYWA VQGQNVLHGY 360
PXDIYSSFGF PRTVKHIDAA LSEENTGKTY PPVANKYWRY OEYKRSMDPG YPKMIAHDFP 420
GIGHKVDAVP MKDGFPYPPH GTRQYKFDPK TKRILTLQKA NSWFNCRKN

Seq ID NO: 394 DNA sequence
Nucleic Acid Accession #: NM_014331.2
Coding sequence: 1..1506

1 11 21 31 41 51
| | | | | |

ATGGTCAGAA AGCCIGTTGT GTCCACCATC T0CAAAGGA6 GVTAOCtOCA GG6AAATGTT 60 

AACGGGAG6C TGCCTTCCCT GGGCAACAAG GAGOCAOCTG G6CAOGAGAA AGTGCAGCTG 120 

AAGAGGAAAG TCACTTTACT GAGGGGAGTC TCCATTATCA TTGGCACCAT CATTGGAGCA 180 

GGAATCTTCA TCTCTCCTAA GGGOGTGCTC CAGAACAOCSG GCACCGTGGG CATGTCTCTG 240 

ACCATCTGGA CGGTGTGTGG GGTCCTGTCA CTATTTGGAG CTTTGTCTTA TGCTGAATTG 300 

GGAACAACTA TAAAGAAATC TGGAGGTCftT TAOVOITATA TTTTGGAAGT CTTTSGTCCA 360 

TTACCAGCTT TTGTAOGAGT CTGGGTGGAA CTCCTCATAA TAOSCCCTGC AGCTACTGCT 420 

GTGATATCCC TGGCATTTGG ACGCTACATT CTGGAACCAT TTTTTATTCA ATGTGAAATC 480 

CCTGAACTTG CGATCAAGCT CATTACAGCT GTGGGCATAA CTGTAGTGAT GGTCCTAAAT 540 

AGCATGAGTG TCAGCTGGAG OGCCCGGATC CAGATTTTCT TAACCTTTTG CAAGCTCACA 600 

GCAATTCTGA XAATTATAGT CCCT6GAGTT ATGCAGCTAA TTAAAGGTCA AAOGCAGAAC 660 

TTTAAAGACX5 OGTTTTCAGG AAGAGATTCA AGTATTAOGC GGTTCCCACT GGCTTTTTAT 720 

TATGGAATGT ATGCATATCC TGGCTGGTTT TAOCTCAACT TTGTTACTGA AGAAGTAGAA 780 

AACCCTGAAA AAACCATTCC CCTTGCAATA TGTATATCCA TGGCCATTGT GACCATTCGC 840 

TATGTGCTGA CAAATGTGGC CTACTTTAOG ACCATTAATG CTGAGGAGCT GCTGCTTTCA 900 

AATGCAGTGG CAGTGACCTT TTCTGAGCGG CTACTGGGAA ATTTCTCATT AGCAGTTCOG 960 

ATCTTTGTTG CCCTCTCCTG CTTTGGCTCC ATGAACGGTG GTGTGTTTGC TGTCTCCAGG 1020 

TTATTCTATG TTGCGTCTOG AGAGGGTCAC CTTCXaGAAA TCCTCTCCAT GATTCATGTC 1080 

CGCAAGCACA CTCCTCTACC AGCTGTTATT GTTTTGCACC CTTTGACAAT G ATAATGCT C 1140 

TTCTCTGGAG ACCTGGACAG TCTTTTGAAT TTOCTCAGTT TTGOCAGGTG GCTTTTTATT 1200 

GGGCTGGCAG TTGCTGGGCT GATTTATCTT CGATACAAAT GOCCAGATAT GCAT CGTC CT 1260 

TTCAAGGTGC CACTGTTCAT CCCAGCTTTG TTTTCCTTCA CATGCCTCTT CATGGTTGCC 1320 

CTTTCCCTCT ATTCGGACCC ATTTAGTACA GG6ATTGGCT TOGTCATCAC TCTGACTGGA 1380 

GTCCCTGOGT ATTATCTCTT TATTATATGG GACAAGAAAC CCAGGTGGTT TAGAATAATC 1440 

TCAGAfiAAAA TAAOCAGAAC ATTACAAATA ATACTGGAAG TTGTACCAGA AGAAGATAAG 1500 

TTATGAACTA ATGGACTTGA GATCTrGGCA ATCTGOCCAA GGGGAGACAC AAAATAOGGA 1560 

TTTTTACTTC ATTTTCTGAA AGTCTAGAGA ATTACAACTT TGGTGATAAA CAAAAGGAGT 1620 

CAGTTATTTT TATTCATATA TTTTAGCATA TTCGAACTAA TTTCTAAGAA ATTTAGTTAT 1680 

AACTCTATGT AGTTATAGAA AGTGAATATG CAGTTATTCT ATGAGTCGCA CAATTCTTGA 1740 

GTCTCTGATA CCTACCTATT GGGGTTAGGA GAAAAGACTA GACAATTACT ATGTGGTCAT 1800 

TCTCTACAAC ATATGTTAGC AOGGCAAAGA ACCTTCAAAT TGAA6ACTGA GATTTTTCTG 1860 

TATATATGGG TTTTGTAAAG ATGGTTTtAC ACACTAC»GA TGTCTATACT GTQAAAAGTG 1920 

TTTTCAATTC TGAAAAAAAG CATACATCAT GATTATCGCA AAGAGGAGAG AAAGAAATTT 1980 

ATTTT A CATT GACATTGCAT TGCTTCCCCT TAGATAOCAA TTTAGATAAC AAACACTCAT 2040 

GCTTTAATGG ATTATACCCA GAGCACTTTG AACAAATXm: AGTGGGGATT GTTGAATACA 2100 

TTAAAGAAGA GTTTCTAGGG GCTACTGTTT ATGAGACACA TCCAGGAGTT ATGTTTAAGT 2160 

AAAAATCCTT GAGAATTTAT TATGTCAGAT GTTTTrTCAT TCATTATCAG GAAGTTTTAG 2220 

TTATCTGTCA ' ATm ' TT TT I TCACATCAGT TTGATCAGGA AAGTGTATAA CACATCTTAG 2280 

AGCAAGAGTT AGTTTGGTAT TAAATCCTCA TTAGAACAAC CACCTGTTTC ACTAATAACT 2340 

TACCCCTGAT GAGTCTATCT AAACATATGC ATTTTAAGCC TTCAAATTAC ATTATCAACA 2400 

TGAGAGAAAT AACCAACAAA GAAGATGTTC AAAATAATAG TCCCATATCT GTAATCATAT 2460 

CTACATGCAA TGTTAGTAAT TCTGAAGTTT TTTAAATTTA TGGCTATTTT TACACGATGA 2S20 

TGAATTTTGA CAGTTTGTGC ATTTTCTTTA TACATTTTAT ATTCTTCTGT TAAAATATCT 2580 

CTTCAGATGA AACT6TCCAG ATTAATTAGG AAAAGGCATA lATTAACATA AAAATTGCAA 2640 

AAGAAATGTC GCTGTAAATA AGATTTACAA CTGATGTTTC TA6AAAATTT CCACTTCTAT 2700 

ATCTAGGCTT TGTCAGTAAT TTCCACACCT TAATTATCAT TCAACTTGCA AAAGAGACAA 2760 

CTGATAAGAA GAAAATTGAA ATGAGAATCT GTGGATAAGT GTTTGTGTTC AGAAGATGTT 2820 

GTTTTGCCAG TATTAGAAAA TACTGTGAGC CGGGCATGGT GGCTTACATC TGTAATCCCA 2880 

GCACTTTGGG AGGCTGAGGG GGTGGATCAC CTGAGGTCGG GAGTTCTAGA CCAGCCTGAC 2940 

CAACATGGAG AAACCCCATC TCTACTAAAA ATACAAAATT AGCTGGGCAT GGTGGCACAT 3000 

GCTGGTAATC TCAGCTATTG AGGAGGCTGA GGCAGGAGAA TTGCTTGAAC CCGGGAGGOG 3060 

GAGGTTGCAG TGAGCCAAGA TTGCACCACT GTACTCCAGC CTGGGTGACA AAGTCAGACT 3120 
CCATCTCCAA AAAAAAAAAA AAAA 

Seq ID NO: 395 Protein sequence 
Protein Accession #: NP_055146.1

1 11 21 31 41 51

| | | | | |

MVRKPWSTI SKGGYLQGNV NGRLPSLGNK EPPGQEKVQIj KRKVTLIiRGV SIIIGTIIGA 60 

GIFISPKGVL QNTGSVGMSL TIWTVCGVLS LFGALSYAEL GTTIKKSGGH YTYILEVFGP 120 

LPAFVRVWVE LLIIRPAATA VISLAFGRYI LBPFFIQCEI PELAIKLITA VGITWMVLN 180 

SMSVSWSARI QIFLTFCKLT AILIIIVPGV KQLIKGQTQN FKDAFSGRDS SITRLPIAFY 240 

YGMYAYAGWP YLNFVTEEVE NPEKTIPLAI CISMAITIGV YVLTNVAYFT TINAEELLLS 300 

KAVAVTFSER LLGNFSLAVP IFVALSCFGS MNGGVFAVSR LPYVASREGH LPEILSMIHV 360 

RKHTPLPAVI VLHPLTMIML FSGDLDSLLN FLSFARNLFX GLAVAGLIYL RYKCPDMKRP 420 

FKVPLFIPAL PSFTCLFMVA LSLYSDPFST GIGFVITLTG VPAYYLFIIM OKKPRSfFRIM 480 
SEKITRTLQI ILEWPSEDK L 



Seq ID NO: 396 DNA sequence
Nucleic Acid Accession #: NM_006528

Coding sequence: 57..764

1 11 21 31 41 51

| | | | | |

GCCGCCAGGG GCTTTCTCGG AOGCCTTGCC CAGOG6GCCG CCXX5ACCCCC TGCACCATGG 60 

ACCCCGCTCG CCCCCTGGGG CTGTCGATTC TGCTGCTTTT CCTGACGGAG GCTGCACIGG 120 

GCGATGCTGC TCAGGAGCCA ACAG6AAATA AC36GG6AGAT CTGTCTCCTG CCCCTAGACT 180 

AC6GACCCTG CCGGGCCCTA CTTCTCCGTT ACTACTAOSA CAGGTACAOG CAGAGCTGCC 240 

GCCAGTTCCT GTACGGGGGC TGCGAGGGCA ACGCCAACAA TTTCTACACC TGGGAGGCTT 300 

GCXy^OGATGC TTGCTGGAGG ATAGAAAAAG TTCCCAAAGT TTGCOGGCTG CAAGTGAGTG 360 

TGGACGACCA GTGTGAGGGG TCCACAGAAA AGTArrPCTT TAATCTAAGT TCCATGACAT 420 

GTGAAAAATT CTTTTCCGGT GGGTGTCACC GGAACCGGAT TGAGAACAGG TTTCCAGATG 480 

AA6CTACTTG TATGGGCTTC TGOGCACCAA AGAAAATTCC AICATTTTGC TACAGTCCAA 540 

AAGATGAGG6 ACTGTGCTCT GCCAATGTGA CTCX3CTATTA TTTTAATCCA AGATACAGAA 600 

CCTGTGATGC TTTCACCTAT ACTG6CTGTG GAGOGAATGA CAATAACTTT GTTAGCAGGG 660
AGGJlTTGCftA AOnGCATGT GCAAAAGCTT T6AAAAAGAA AAASAAGATG CCAAA6CTTC 720
GCTTTGOCAG TAGAATCCGG AAAATTQGGA AGAAGCAATT TTAAACATTC TTAATATGTC 780
ATCrPGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTGT ATCTGAAGAA T3\ATATGACA 840
GCATGAGGAA ACAAATCATT GGTGATTTAT TCACCAGTTT TTATEAAIAC AAGTCACTTT 900
TTCAAAAATT TX3GATTTTTT TATATR TAAC TAGCTGCTAT TCAAATGTGA GTCTACCATT 960
TTTAATTTAT GGTTCAACTG TTTGTGAGAC GAATTCTTGC AATGCATAAG ATATAAAAGC 1020
AAATATGACT CACTCATTTC TTGGGGTCGT ATTGCTGATT TCAGAAGAGG ATCATAACTG 1080
AAACAACATA AGACAATATA ATCATGTGCT TTTAACRTAT TTGAGAATAA AAAGGACTAG 1140
CC

Seq ID NO: 397 Protein sequence
Protein Accession #: NP_006519

1 11 21 31 41 51

| | | | | |

MDPARPLGLS ILLLPLTSAA LGDAAQ3PTG NNAEICLLPL DYGPOIALIA RYYYDRYTQS 60
CRQFLYGGCE GNANNFYTWE ACDDACWRIE KVPKVCRIXJV SVDDQtEGST BiCyFFNLSSM 120
TCEKFFSGGC HBNRIENRPP DEATOCPCA FKKIPSFCYS PKDEGLCSAN VTRYYFNPRY 180
RTCDAFTYTG CGGNONNFVS REDCKRACAK AUCKRRXMPK LRFASRItUa RXXQF

Seq ID NO: 398 DNA sequence
Nucleic Acid Accession #: NM_001508.1
Coding sequence: 1..1361

1 11 21 31 41 51
| | | | | |

ATGGCTTCAC CCAGCCTCCC GGGCAGTGAC TGCTCCCAAA TCATTGATCA CAGTCAT6TC 60
CCCGAGTTTG AGGTGGCCAC CTGGATCAAA ATCACCCTTA TTCTGGTGTA CCTGATCATC 120
TTCGTGATCG GCCTTCTGGG GAACAGCGTC ACCATTCGGG TCACCCAGGT GCTGCAGAAG 180
AAAGGATACT TGCAGAAGGA GGTGACAGAC CACATGGTGA GTTTGGCTTG CTGGGACATC 240
TTGGTGTTCC TCATCGGCAT GCCCATGGAG TTCTACAGCA TCATCTGGAA TCCCCTGACC 300
AOGTGCAGCT ACACCCTGTC CTGCAAGCTG CACACTTTCC TCTTGGAGGC CTGCAGCTAC 360
GCTACGCTGC TGCACGTGCT GACGCTCAGC TTTGAGOGCT ACATOGCCAT CTGTCACCCC 420
TTCAGGTACA AGGCTGTGTC GGGACCTTGC CAGGTGAAGC TGCTGATTGG CTTCGTCTGG 480
GTCACCTCCG CCCTGGTGGC ACTGCCCTTG CTGTTTGCCA TGGGTACTGA GTACCCCCTG 540
GTGAACGTGC CCAGCCACCG GGGTCTCACT TGCAACCGCT CCAGCACOCG CCACCACGAG 600
CAGCCCGAGA CCTOCAATAT GTGCATCTGT ACCAACCTCT CCAGCXGCTG GACOGTGTTC 660
CAGTCCAGCA TCTTCGGGGC CTTCGTGGTC TACCTCGTQG TCCTGCTCTC CGTAGCCTTC 720
ATGTGCTGGA ACATGATGCA GGTGCTCATG AAAAGCCAGA AGGGCTCGCT GGC06GGGGC 780
AOGCGGCCTC CGCAGCTGAG GAAGTCCX3AG AGCGAAGAGA GCAGGAOOGC CAGGAGGCAG 840
ACCATCATCT TCCTGAGGCT GATTGTTGTG ACATTGGCOS TAT6CT6GAT GCCCAACCAG 900
ATTCGGAGGA TCATGGCTGC GGCCAAACCC AAGCACGACT GGAOGAGGTC CTACTTCCGG 960
GCGTACATGA TCCTCCTCCC CTTCTCGGAG ACGTTTTTCT ACCTCAGCTC GGTCATCAAC 1020
CCGCTCCTGT ACACGGTGTC CTCGCAGCAG TTTCGGCGGG TGTTCGTGCA GGTGCTGTGC 1080
.......... CGCT6CAGCA CGCCAACCAC GAGAAGCGCC TGOGCGTACA TGCGCACTCC 1140
ACCACCGACA GCGCXX3GCTT TGTGCAGCGC COGTTGCrCT TOGCGTCCCG GOGCCAGTCC 1200
TCTGCAAGGA GAACTGAGAA GATTTTCTTA AGCACTTTTC AGAGCGAGGC CGAGCCCCAG 1260
TCTAAGTCCC AGTCATTGAG TCTCGAGTCA CTAGAGCCCA ACTCAGGOGC GAAACCAGCC 1320
AATTCTGCTG CAGAGAATGG TTTTCAGGAG CATGAAGTTT GA

Seq ID NO: 399 Protein sequence
Protein Accession #: NP_001499.1

1 11 21 31 41 51
| | | | | |

MASPSLPGSD CSQIIDHSHV PEFEVATWIK ITLILVYLII PVMGLLGNSV TIRVTQVLQK 60
KGYLQKEVTD HMVSLACSDI LVFLIGMPME FYSIIWNPLT TSSYTLSCKL HTFLFEACSY 120
ATLIiHVLTIiS FERYIAICHP FRYXAVSGFC QVKLLIGFVW VTSACVALPL LFAMGTEYPI. 180
VNVPSHRGLT CNRSSTRHHE QPETSNMSIC TNLSSRWTVF QSSIFGAFW YLWLLSVAF 240
KCWNMMQVLM KSQKGSLAGG TRPPQLRKSE SEESRTARRQ TIIPLRLIW TLAVCWMPNQ 300
IRRIMAAAKP KHDVfTRSYPR AYMILLPFSE TPFYLSSVIN PLLYTVSSQQ FRRVPVQVLC 360
CRLSLQHAKH EKRLRVHAHS TTDSARFVQR PLLFASBRQS SARRTBKIFL STFQSEAEPQ 420
SKSQSLSLES LEPMSGAKPA NSAAENGFQS HEV

Seq ID NO: 400 DNA sequence
Nucleic Acid Accession #: NM_006475.1
Coding sequence: 28..2538

1 11 21 31 41 51
| | | | | |

AACAGAACTG CAA06GAGAG ACTCAAGATG ATTCCCTTTT TAOCCATGTT TTCTCTACTA 60
TTGCTGCTTA TTGTTAACCC TATAAACGCC AACAATCATT ATGACAAGAT CTTGGCTCAT 120
AGTCGTATCA GGGGTCGG6A CCAAGGCCCA AATGTCTGTG CCCTTCAACA GATTTTGGGC 180
ACCAAAAAGA AATACTTCAG CACTTGTAAG AACTGGTATA AAAAGTCCAT CTGTGGACAG 240
AAAACGACTG TTTTATATGA ATGTTGCCCT GGTTATATGA GAATGGAAGG AATGAAAGGC 300
TGCCCAGCAG TTTTGCCCAT TGACCATGTT TATGGCACTC TGGGCATOGT GGGAGCCACC 360
ACAACGCAGC GCTATTCTGA CGCCTCAAAA CTGAGGGAGG AGATCGAGGG AAAGGGATCC 420
TTCACTTACT TTGCACCGAG TAATGAGGCT TGGGACAACT TG6ATTCTGA TATCOGTAGA 480
GGTTTGGAGA GCAAOSTGAA TGTTGAATTA CTGAATGCTT TACATAGTCA CATGATTAAT 540
AAGAGAA1GT TGACCAAGGA CTTAAAAAAT GGCATGATTA TTCCTTCAAT GTATAACAAT 600
TTGGGGCTTT TCATTAACCA TTATCCTAAT GGGGTTGTCA CTGTTAATTG TGCTCGAATC 660
ATCCATGGGA ACCAGATTGC AACAAATGGT GTTGTCCATG TCATTGACCG TGTGCTTACA 720
CAAATTGGTA CCTCAATTCA AGACTTCATT GAAGCAGAAG ATQACCTTTC ATCTTTTAGA 780
GCAGCTGCCA TCACATCGGA CATATT6GAG GCGCTTGGAA GAGAOGGTCA CTTCACACTC 840
TTTGCTCCCA CCAATGAGGC TTTTGAGAAA CTTCCAOGAG GTGTCCTAGA AAGGTTCAT6 900
GGAGACAAAG TGGCTTCCGA AGCTCTTATG AAGTACCACA TCTTAAATAC TCTCCAGTGT 960
TCTGAGTCTA TTATG6GAGG AGCAGTCTTT GAGAOGCIG6 AAG6AAATAC AATTGAGATA 1020
GGATGTX3ACX3 GTCACAGTAT AACAGTAAAT GGAATCAAAA TGGTGAACAA AAAGGATATT 1080 

GTCACAAATA ATGGTOTGAT CCATTTGATT GATC3U3GTCC TAATTCCTGA TTCTGCC3VAA 1140 

CAAGTTAITG AGCTGGCTGG AAAACAGCAA ACCACCTTCA OGGATCTTGT GGCCCAATTA 1200 

GGCTTQGCAT CTGCTCTGAG GCCAGATGGA GAATACACTT TGCTGGCACC TGTGAATAAT 1260 

GCATTTTCTG ATGATACTCT CAGCATGGTT CAGOGOCTOC TfAAATTAAT TCTGCAGAAT 1320 

CACATATOA AAGTAAAAGT TGGCCTTAAT GftGCTTTAd AOGGGCAAAT ACTGGAAACC 1380 

ATCGGAGGCA AACAGCTCAG AGTCTTCGTA TATOGTACAG CTGTCTGCAT TGAAAATTCA 1440 

TGCATCGAGA AAGGGAGTAA GCAAGGGAGA AAOGGTGCGA TTCAC3VTATT OOGCGAGATC 1500

ATCAAGCCAG CAGAGAAATC CXTTCCATGAA AAGTTAAAAC AACATAAGOG CTTTAGCACC 1560

TTCCTCAGCC TACTTGAAGC TGCAGACTTG AAAGAGCTCC TGACACAACC TGGAGACTGG 1620 

ACATTATTTG TGCCAACCAA TGATGCTTTT AAG6GAATGA CTAGTGAAGA AAAAGAAATT 1680 

CTGATACGGG ACAAAAATGC TCTTCAAAAC ATCATTCTTT ATCACCTGAC ACCAGGAflTT 1740 

TTCATTCGAA AAGGATTTCA ACCTCCTGTT ACTAAC3VTTT TAAAGACCAC ACAAGGAAGC 1800 

AAAATCTTTC TGAAAGAAGT AAATGATACA CTTCTGGTGA ATGAATTGAA ATCAAAAGAA 1860 

TCTGACATCA TGACAACAAA TGGTGTAATT CATGTTGTAG ATAAACTCCT CTATCCAGCA 1920 

GACACACCTC TTCGAAATGA TCAACTGCTG GAAATACITA ATAAATTAAT CAAATACATC 1980 

CAAATTAAGT 'ri^rT C GTGG TAGCACCTTC AAAGAAATCC OCGTGACnn" CTATACAACT 2040 

AAAATTATAA CCAAACTTGT GGAACCAAAA ATTAAAGTGA TTGAAGGCAG TCTTCAGCCT 2100 

ATTATCAAAA CTGAAGGACC CACACTAAC3V AAAGTCAAAA TTGAAGGTGA ACCTGAATTC 2160 

AGACTGATTA AAGAAGGTCA AACAATAACT GAAGTGATCC ATGGAGAGCC AATTATTAAA 2220 

AAATACACCA AAATCATTGA TGGAGTGCCT GTGGAAATAA CTGAAAAAGA GACACGAGAA 2280 

GAAGGAATCA TTACAGGTCC TGAAATAAAA TACACTAGGA TTTCTACTGG AGGTGGAGAA 2340 

ACAGAAGAAA CTCKSAAGAA ATTGTTACAA GAAlGAGGTCA CCAAGGTCAC CAAATTCATT 2400 

GAAGGTGGTG ATGGTCATTT ATTTGAAGAT GAAGAAATTA AAAGACTGCT TCAGGGAGAC 2460 

ACACCCGTGA GGAA6TTGCA AGCCAACAAA AAAGTTCAAG GTTCTAGAAG AOaATTAAGG 2520

GAAGGTCGTT CTCAGTGAAA ATCCAAAAAC CAGAAAAAAA TGTTTATACA ACCCTAAGTC 2580

AATAACCTGA CCTTAGAAAA TTCTGAGAGC CAAGTTGACT TCAGGAACTG AAACATCAGC 2640

ACAAAGAAGC AATCATCAAA TAATTCTGAA CACAAATTTA ATATTTTTTT TTCTGAATGA 2700 

GAAACATCAG GGAAATTGTG GAGTTAGOCT OCTGTGGTAA AGGAATTGAA GAAAATATAA 2760 

CACCTTACAC CCTTTTTCAT CTTGACATTA AAAGTTCTGG CTAACTTTGG AATCCATTAG 2820 

AGAAAAATCC TTGTCACCAG ATTCATTACA ATTCAAATCXJ AAGAGTTGTG AACTGTTATC 2880 

CCATTGAAAA GACCGAGCCT TGTATGTATG TTATGGATAC ATAAAAtGCA CGCAAGOCAT 2940 

TATCTCrCCA TGGGAAGCTA AGTTATAAAA ATAGGTGCTT GGTGTACAAA ACTTTTTATA 3000 

TCAAAAGGCT TTGCACATTT CTATATGAGT GGGTTTACTG GTAAATTATG TTATTTTTTA 3060 

CAACTAATTT TGTACTCTCA GAATGTTTGT CATATGCTTC TTGCAATGCA TATTrPTTAA 3120 

TCTCAAAC6T TTCAATAAAA CCATTTTTCA GATATAAAGA GAATTACTTC AAATTGAGTA 3180 
ATTCAGAAAA ACTCAAGATT TAAGTTAAAA AGTGGTTTGG ACTTGGGAA 

Seq ID NO: 401 Protein sequence
Protein Accession #: NP_006466.1

1 11 21 31 41 51

| | | | | |

MIPFLPMPSL UiLLIVNPXN ANNHYDKILA HSRiaGROQG PNVCALQQIL GTiOaCYFSTC 60 

KNWYKKSICG QKTTVLYECC PGYMRMEGMK GCPAVIiPIDH VYGTLGIVGA TTTQRYSDAS 120 

KLREEIEGKG SFTYFAPSNE AWDNLDSOIR RGLESNVNVE LLNALHSHMI NKRMLTKDLK 180 

KGMIIPSMYN NLGLPINHYP NGWTVNCAR IIHGNQIATM GWHVIDRVL TQIGTSIQDP 240 

lEAEDDLSSF RAAAITSDIL EALGRDGHFT LFAPTNEAPE KLPRGVLERF KGDKVASEAL 300 

MKYHILNTLQ CSESIMGGAV FETLECaiTIE IGCaXHJSITV NGIKMVNKKD IVTNNGVIHL 360

IDQVLIPDSA KQVIEIiAGKQ QTTFTDLVAQ ZiGLASALRFD GSYTLLAPVll NAFSDOTLSM 420 

VQRLLKLILQ NHILKVKVGL NELYKGQILB TIGGKQLRVF VYRTAVCIEM SCMEKGSKQG 480 

RNGAIHIFRE IIKPAEKSLH EKLKQDKRPS TFLStLEAAD LKELLTQPGD WTLFVPTODA 540 

FKGMTSEEKE ILIRDKNALQ NIILYHLTPG VFIGKGFEPG VTNILKTTQG SiaFLKEVND 600 

TLLVNELKSK ESDIMTTNGV IHWDKLLYP ADTPVGNDQL LEILNKLIKY IQIKFVRGST 660 

FKEIPVTVYT TKIITKWEP KIKVIEGSLQ PIIKTB6PTL TKVKIEGEPE FRLIKEGETI 720 

TEVIHGEPIZ KKVTKIIDGV PVEITEKBTR EHillTGPEI KYTRISTGGG ETEETLKKUi 780 
QESVTKVTKP lEGGDGHLPE OEEIKRLLQG DTPVHKLQAN KKVQGSRRRL REGRSQ 

Seq ID NO: 402 DNA sequence 
Nucleic Acid Accession #: NM_002416
Coding sequence: 40..417

1 11 21 31 41 51 

ATCCAATACA GGAGTGACTT GGAACTCTAT TCTATCACTA TGAAGAAAAG TGGTGTTCTT 60 

TTCCTCTTGG GCATCATCTT GCTGGTTCTG ATTGGAGTGC AAGGAACCCC AGTAGTGAGA 120 

AAGGGTCGCT GTTCCTGCAT CAGCACCAAC CAAGGGACTA TCCACCTACA ATCCTTGAAA 180 

6ACCTTAAAC AATTTGCCCC AAGCCCTTCC TGCX3AGAAAA TTGAAATCAT TGCTACACTG 240 

AAGAATGGAG TTCAAACATG TCTAAACCCA GATTCAGCAG ATGTGAAGGA ACTGATTAAA 300 

AAGTGGGAGA AACAGGTCAG CCAAAAGAAA AAGCAAAAGA ATGG6AAAAA ACATCAAAAA 360 

AAGAAAGTTC TGAAAGTTCG AAAATCTCAA CGTTCTaSTC AAAAGAAGAC TACATAAGAG 420 

ACCACTTCAC CAATAAGTAT TCTGTGTTAA AAATGTTCTA TTTTAATTAT ACXMCTATCA 480 

TTCCAAAGGA GGATGGCATA TAATACAAAG GCTTATTAAT TTGACTAGAA AATTTAAAAC 540

ATTACTCTGA AATTGTAACT AAAGTTAGAA AGTTGATTTT AAGAATCX:AA ACCTTAAGAA 600 

TTGTTAAAGG CTATGATTGT CTTTGTTCTT CTACCACCCA CCAGTTGAAT TTCATCATGC 660 

TTAAGGCCAT GATTTTAGCA ATACCCATGT CTACACAOAT GTTCACCCAA CCACATCOCA 720 

CTCACAACAG CTGCCTGGAA GAGCAOOOCT AGGCTTOCAC GTACTGCA6C CTGOVGAGAG 780 

TATCTGA6GC ACATGTCAGC AAGTCCTAAG CCTGTTAGCA TGCTGGTGAG CCAAGCAGTT 840 

TGAAATTGAG CTGGACCTCA CCAAGCPGCT GTGGCCATCA ACCTCTGTAT TTOAATCAGC 900 

CTACAGGCCT CACACACAAT GTGTCTGAGA GATTCATGCT GATTGTTATT GGGTATCACC 960

ACTGGAGATC ACCAGTGTCT GGCTTTCAGA GCCTCCTTTC TGGCPTTGGA AGCCATGTGA 1020 

TTCCATCTTG CCC3GCTCAGG CTGACCACTT TATTTCTTTT TGTrCCCCTT TGCTTCATTC 1080 

AAGTCAGCTC TTCTCCATCC TACCACAATG CAGTGCCTTT CTTCrCTOCA GTGCACCTGT 1140 

CATATGCTCT GATTTATCTG AGTCAACTCC TTTCrCATCT TGTCOOCAAC ACCCCACAGA 1200

AGTGCTTTCT TCTCCCAATT CATCCTCACT CAGTCCAGCT TAGTTCAAGT CCTGCCTCTT 1260 

AAATAAACCT TTTTCGACAC ACAAATTATC TTAAAACTCC TGTTTCACTT GGTTCAGTAC 1320 

CRCRTGGGTG AACACTCRAT GGTTAACTAA TTCTTGGGTG TTTATCCTAT CTCTCCAACC 1380 



AGATTGTCAG CTCCTTGAGC3 GCAAGACCCA CAGTATATTT CCCTGTTTCT TCCACAGTGC 1440

CTAATAATAC TGTGGAACTA GGTTTTAATA ATTXTTTAAT TGATCTTGTT ATGGGCAGGA 1500

TGGCAACCAG AOCATTGTCT CAGAGCAGGT GCTOGCTCTT TCCTGGCTAC TCCATGTTGG 1560

CTAGCCTCTG GTAAOCTCTT ACTTATTATC TTCAGGACAC TCACTACAGG GACCAGGGAT 1620 

GATGCAACAT CLTrU ' l ' CA " l 'l' TTATGACAGG ATGTTTGCTC AGCTTCTOCA ACAATAAGAA 1680

GCACGTGGTA AAACACTTGC GGATATTCTG GACIGTTrTT AAAAAATATA CAGTTTAOOG 1740

AAAATCATAT AATCTTACAA TGAAAAGGAC TTTATAGATC AGCCAGT6AC CAACCTTTTC 1800 

CCAACCATAC AAAAATTCCT TTTCCOGAAG GAAAAGGGCT TTCTCAATAA GCCTCAGCTT 1860 

TCTAAGATCT AACAAGATAG CCACCGACAT CCTTATCGAA ACTCATTrTA GGCAAATATG 1920 

AGTrTTATTG TCCGTTTACT TGTTTCAGAG TTTGTATTGT GATTATCAAT TACCACACCA 1980 

TCTCCCATGA AGAAAGG6AA OGGTGAAGTA CTAAGGGCTA GAOGAAGGAG CCAAGTGGGT 2040 

TAGTGGAAGC ATGATTOCSTG CCCAGTTACC CTCTGCftGGA TGTGGAAACC TCCTTCCAGG 2100 

GGAGGTTCAG TGAATTGTGT AGGAGAGGTT GTCTGTGGCC AGAATTTAAA CCTATACTCA 2160 

CTTTCCCAAA TTCAATCACT GCTCACACTG CTGATGATTT AGAGTGCTGT COGGTGGAGA 2220 

TCXX:ACCCGA AOGTCTTATC TAATCATGAA ACTCCCTAGT TCCTTCATGT AACTTCCCTG 2280 

AAAAATCZAA GTGTTTCATA AATTTGAGAG TCTGTGACCC ACTTACCTTG CATCTCACAG 2340 

GTAGACAGTA TATAACTAAC AACCAAAGAC TACATATTGT CACTGACACA CACXTTTATAA 2400 

TCATTTATCA TATATATACA TACATGCATA CACTCPCAAA GCAAATAATT TTTCACTTCA 2460

AAACAGTATT GACTTGTATA CCTTGTAATT TCAAATATTT TCTTTGTTAA AATAGAATGG 2520 
TATCAATAAA TAGACCATTA ATCAG 

Seq ID NO: 403 Protein sequence
Protein Accession #: NP_002407

1 11 21 31 41 51

| | | | | |

^^KKSGVLFLb GIIUiVLIGV QGTPWRKGR CSCISTKQGT IHIiQSLKDLK QFAPSPSCEK 60 

IBIIATLKNG VOTCLNPDSA DVKELIKKWE KQVSQKKKQK NGKKHQKKKV LKVRKSQRSR 120 
QKRrr 

Seq ID NO: 404 DNA sequence 

Nucleic Acid Accession #: NM_006670

Coding sequence: 85..1347

1 11 21 31 41 51

| | | | | |

CCGGCTCGCG CCCTCOSGGC CCAGCCTCCC GAGCCTTOQG AGCGGGOGCC GTCCCAGCCC 60 

AGCTCOSGGG AAACGCGAGC CGCGATGCCT GGGGGGTGCT CCCGGGGCCC CGCCGCCGGG 120 

GACGGGCGTC TGCGGCTGGC GCGACTAGCG CTGGTACTCC TGGGCTGGGT CTCCTCX3TCT 180 

TCTCCCACCT CCTCX3GCATC CTCCTTCTCC TCCTCGGCGC CGTTCCTGGC TTCOGCCGTG 240 

TCCGCCCAGC CCCCGCTGCC GGACCAGTGC CCCGCGCTGT GCGAGTGCTC CGAGGCAGCG 300 

CX^CACAGTCA AGTGCGTTAA CCGCAATCTG ACCGAGGTGC CCACGGACCT GCCCGCCTAC 360 

GTGCGCAACC TCTTCCTTAC CGGCAAOCAG CTGGCCGTGC TCCCTGCCS3G CGCCTTCGCC 420 

CGCCGGCCGC CGCTGGOGGA GCTGGCOGCG CTCAACCTCA GOGGCAGCCG CCTGGA06AG 480 

GTGCGCGCX3G GCGCCTTCGA GCATCTGCCC AGCCTGCGCC AGCTCGACCT CAGCCACAAC 540 

CCACTGGCCG ACCTCAGTCC CTTCGCTTTC TCGGGCAGCA ATGCX»GCX5T CTCGGCCCCC 600

AGTCCCXTTTG TGGAACTGAT CCTGAACCAC ATCGTGCCCC CTGAAGATGA GCGGCAGAAC 660 

aSGAGCTTOS AGGGCATGGT GGTGGCGGCC CTGCTGGOGG GCCGTGCACT GCAGGGGCTC 720 

CGCC53CTTGG AGCXXWCCAG CAACCACTTC CTTTACCTGC CXjOSGGATGT GCTGGCCCAA 780 

CTX3CCCAGCC TCAGGCACCT GGACTTAAGT AATAATTOGC TGGTGAGCCT GAOCTAGGTG 840 

TCCTTCCGCA ACCTGACACA TCTAGAAAGC CTCCACCTGG AGGACAATGC CCTCAAGGTC 900 

CTTCACAATG GCACCCTGGC TGAGTTGCAA GGTCTACCCC ACATTAGGGT TTTCCTGGAC 960 

AACAATCCCT GGGTCTGCGA CTGCCACATG GCAGACATGG TGACCTGGCT CAAGGAAACA 1020 

GAGGTAGTGC AGGGCAAAGA CCGGCTCACC TGTGCATATC CGGAAAAAAT GAGGAATCGG 1080 

GTCCTCTTGG AACTCAACAC TGCTGACCTG GACTGTGACC CGATTCTTCC CCCATCCCTG 1140 

CAAACCTCTT ATCTCTTCCT GGGTATTGTT TTAGCCCTGA TAGGCGCTAT TTTCCTCCTG 1200 

GTTTTGTATT TGAACOGCAA 6GGGATAAAA AAGTGGATGC ATAACATCAG AGATGCCTGC 1260 

AGGGATCACA TGGAAGGGTA TCATTACAGA TATGAAATCA ATGOGGACCC CAGATTAACA 1320 

AACCTCAGTT CTAACTCGGA TGTCTGAGAA ATATTAGAGG ACAGACCAAG GACAACTCTG 1380

CATGAGATGT AGACTTAAGC TTTATCCCTA CTAGGCTTGC TCCACTTTCA TCCTCCACTA 1440 

TAGATACAAC GGACTTTGAC TAAAAGCAGT GAAGGGGATT TGCTTCCTTG TTATGTAAAG 1500 

TTTCTOGGTG TGTTCTGTTA ATGTAAGACG ATGAACAGTT GTGTATAGTG TTTTACCCTC 1560 

TTCTTTTTCT TGGAACTCCT CAACACGTAT GGAQGGATTT TTCAGGTTTC AGCAT6AACA 1620 

TGGGCTTCTT 6CTGTCTGTC TCTCTCTCAG TACAGTTCAA GGTGTAGCAA GTGTACCCAC 1680 

ACAGATAGCA TTCAACAAAA GCTGCCTCAA CTTTTTCGAG AAAAATACTT TATTCATAAA 1740 

TATCAGTTTT ATTCTCATGT ACCTAAGTTG TGGAGAAAAT AATTGCATCC TATAAACTGC 1800 

CTGCAGACGT TAGCAGGCTC TTCAAAATAA CTCCATGGTG CACAGGAGCA CCTGCATCCA 1860 

AGAGCATGCT TACATTTTAC TGTTCTGCAT ATTACAAAAA ATAACTTGCA ACTTCATAAC 1920 

TTCTTTGACA AAGTAAATTA CTTTTTfGAT TGCAGTTTAT ATGAAAATGT ACTGATTTTT 1980

TTTTAATAAA CTGCATCGAG ATCCAACOGA CTGAATTGTT AAAAAAAAAA AAAAATAAAG 2040 
ATTCTTAAAA GAA 

Seq ID NO: 405 Protein sequence 
Protein Accession #: NP_006661

1 11 21 31 41 51

| | | | | |

MPGGCSRGPA AGDGRLRLAR EAIiVLLGWVS SSSPTSSASS FSSSAPFLAS AVSAQPPLPD 60 

QCPALCECSE AARTVKCVNR NLTEVPTDLP AYVRNLFLTG NQLAVLPAGA PARRPPLAEL 120 

AALNLSGSRL DEVRAGAFEH LPSLRQLDIiS HNPLADLSPF APSGSNASVS APSPLVELIL 180 

NKIVPPEDER QllHSPEGMW AALLAGRALQ GLRRLELASN HFLYLPROVL AQLPSLRHLO 240 

LSNNSIjVSLT YVSFRNLTHL ESLHLEDifAIi KVLHNGTLAE LQGLPHIRVF LDNNPWVCDC 300 

HMADMVTWLK ETBWQGKDR LTCAYPEKMR NRVLLELNSA DLDCDPILPP SLQTSYVFLG 360 
IVLALIGAIF LLVIiYLNRKG IKKWMKNIRD ACRDHHEGYH YRYBINADPR LTNLSSHSDV 

Seq ID NO: 406 DNA sequence

Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..927

1 11 21 31 41 51

| | | | | |

ATGCCTGGGG GGTCCTCCOG GGGCCCOGCC GCCGGGGAOG GGCXTTCTGOG GCTGGCGCGA 60 

CTAGOGCTGG TACTCCTGGG CTGGGTCTCC TOGTCTTCTC CCACCTCCTC GGCATCCTCC 120 

TTCTCCrCCT CGGCGCCGTT CCTGGCTTCC GCOGTGTCOG CCCAGCCCCC GCTGCOGGAC 180 

CAGTCCCCOG CGCTGTGCXSA GTGCTCCGAG GCAGCGCGCA CAGTCAAGTG CGTTAACCGC 240 

AATCTGACCG AGGTGCCCAC GGACCTGCCC GCCTACGTGC GCAACCTCTT CCTTACCGGC 300 

AACCAGCTGG CCAGCAACCA CTTCCTTTAC CTGCCGOGGG ATGTGCTCGC CCAACTGCCC 360 

AGCCTCAGGC ACCTGGACTT AAGTAATAAT TOGCTGGTGA GCCTGACCTA CGTGTCCTTC 420 

CGCAACCTGA CACATCTAGA AAGCCTCCAC CTGGAGGACA ATGCCCTCAA GGTCCTTCAC 480

AATCGCACCC TGGCTGAGTT GCAAGGTCTA CXXTCACATTA GGGTTTTCCT GQACAAC3UVT 540 

CCCTGGGTCT GOGACTGCCA CATGGCAGAC ATGGTGACCT GGCTCAAGGA AACAGAGGTA 600 

GTGCAGGGCA AAGACCGGCT CACCTGTGCA TATCGGGAAA AAAT6AGGAA TOGGGTCCTC 660 

TTGGAACTCA ACftGTGCTGA CCTGGACTGT GACCCGATTC TTCCCCCATC GCTGCAAACC 720 

TCTTATGTCT TCCTGGGTAT TGTTTTAGCC CTGATAGGCXS CTATTTTCCT CCTGGTTTTG 780 

TATTTGAACC GCAAGGGGAT AAAAAAGTGG ATGCATAACA TCAGAGATGC CTCCAGGGAT 840 

CACATGGAAG GGTATCATTA CAGATATGAA ATCAATGCX5G ACOCCAGATT AACAAACCTC 900 
AGTTCTAACT OGGATGTCCT CGAGTGA 

Seq ID NO: 407 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

KPGGCSRGPA ACDGRLRUVR LALVLLGWVS SSSPTSSASS PSSSAPFLAS AVSAQPPLPD 60

QCPAIiCECSE AARTVKCVNR NIiTEVPTDLP AYVRNLFLTG NQLASNHFLY LPRDVLAQLP 120

SLRELDLSNN SLVSLTYVSP RNLTBLSSLB LEDHALKVZJl MGTLAELQGIi PEXRVFLONN 180

PWVCDCKMAD MVTWLKETEV VQGXDRLTCA YPEKMRNRVL LELNSADLDC DPILPPSIiQT 240

SYVFLGIVXiA LIGAIFLLVL yZHRKGIKKH MHKXimACRD HMEGYHYRYS ZKADPRliTNL 300
SSNSDVIiE

Seq ID NO: 408 DNA sequence
Nucleic Acid Accession #: NM_000095.1
Coding sequence: 26..2299

1 11 21 31 41 51

| | | | | |

CAGCACCCAG CTCCCCGCCA CCGCCATGGT CCCCGACACC GCCTGOGTTC TTCTGCTCAC 60

CCTCGCTGCC CTCGGCGCGT CCG6ACAGGG CCAGAGCCGG TTGGGCTCAG ACCTGGGCCC 120

GCAGATGCTT CGGGAACTGC AGGAAACCAA CGOGGOGCTG CAGGACGTGC GGGRCTGGCT 180

GCG6CAGCAG GTCAQGGAGA TCACGTTCCT GAAAAACACG GTGATGGAGT GTGAC6CGTG 240

CGGGATCCAG CAGTCAGTAC GCACCGGCCT ACCCAGCGTG CGGCCCCTGC TCCACTGCGC 300

GCCCGGCTTC TGCTTCCCCG GCGTGGCCTG CATCCAGACG GAGAGCGGCG GCCGCTGCGG 360

CCCCTCCCCC GCGGGCTTCA CGGGCAACGG CTCGCACTGC ACCGACGTCA ACGAGTGCAA 420

CGCCCACCCC TGCTTCCCCG GAGTCCGCTG TATCAACACC AGCCCGGGGT TCCGCTGCGA 480

GGCTTGCCCG CGGGGGTACA GCGGCCCCAC CCACCAGGGC GTGGGGCTGG CTTTCGCCAA 540

GGCCAACAAG CAGGTTTGCA OGGACATCAA CGAGTGT6AG ACCGGGCAAC ATAACTGCGT 600

CCCCAACTCC GTGTGCATCA ACACCCGGGG CTCCTTCCRG TGCG0CCO6T GCCRGCCOGG 660

CTTCGTGGGC GACCAGGCGT CCGGCTGCCA GOGCGGGGCA CAGCGCTTCT GCCCCGACQG 720

CTCGCCCAGC GAGTGCCACG AGCATGCAGA CTGCGTCCTA GAGCGCGATG GCTCGOSGTC 780

GTGCGTGTGT CGCGTTGGCT GGGCCGGCAA CGGGATCCTC TGTGGTCGCG ACACTGACCT 840

AGACGGCTTC CCGGACGAGA AGCTGCGCTG CCCGGAGCCG CAGTGCCGTA AGQACAACTG 900

CSTGACrOTG GCCAACTCAG GGCAGGAGGA T6T6GACCGC GATGGCAT06 GAGAGGCCTG 960

CGATOCGGAT GCCGACGGGG ACGGGGTCCC CAATGAAAAG GACAACTGCC CGCTGGTGOG 1020

GAACCCAGAC CAGCGCAACA CGGACGAGGA CAAGTGGGGC GATGCGTGCG ACAACTGCCG 1080

GTCCCAGAAG AACGACGACC AAAAGGACAC AGACCAGGAC GGCCGGGGCG ATGCGTGOGA 1140

CGAGGACATC GACGGOGACC GGATCCGCAA CCAGGCCGAC AACTGCCCTA GGGTACCCAA 1200

CTCAGAGCAG AAGGACAGTG ATGGCGATGG TATAGGGGAT GCCTGTGACA ACTGTCCCCA 1260

GAAGAGCAAC COGGATCAGG CGGATGTGGA CCAOGACTTT GTGGGAGATG CTTGTCACAG 1320

CGATCAAGAC CAGGATGGAG ACGGACATCA GGACTCTCCG GACAACTGTC CCAGGGTGCC 1380

TAACAGTGCC CAGGAGGACT CAGACCAGGA TGGCCAGGGT GATGCCTGCG ACGACGAGGA 1440

CGACAATGAC GGAGTCCCTG ACAGTCGGGA CAACTGCOCSC CTGGTGCCTA ACOOOGGCCA 1500

GGAGGACGOG GACAGGGACG GCGTGGGCGA OGTGTGCCAG GACGACTTTG ATGCAGACAA 1560

GGTGGTAGAC AAGATCGACG TGTGTCCGGA GAACGCTGAA GTCACGCTCA CCGACTTCAG 1620

GGCCTTGCAG ACAGT0GT6C TGGACCCGGA GGGTGAOGCG CAGATTGACC CCAACTGGGT 1680

GGTGCTCAAC CA6GGAAGG6 AGATCGTGCA GACAATGAAC AQGGACCCA6 GCCTGGCTGT 1740

GGGTTACACT GCCTTCAATG GCGTGGACTT CGAGGGCAOG TTCCATGTGA ACACGGTCAC 1800

GGATGACGAC TATGCGGGCT TCATCTTTGG CTACCAGGAC AGCTCCAGCT TCTACGTGGT 1860

CATGTGGAAG CAGATGGAGC AAACX5TATTG GCAGGCGAAC CCCTTCCGTG CTGTGGC06A 1920

GCCTGGCATC CAACTCAAGG CTGTGAAGTC TTCCACAGGC CCCGGGGAAC AGCTGCGGAA 1980

OGCTCTGTGG CATACAGGAG ACACAGAGTC CCAGGTGCGG CTGCTGTGGA AGGACCOGOG 2040

AAAOGTGGGT TGGAAGGACA AGAACTCCTA TOGTTGGTTC CTGCAGCACC GGCCCCAAGT 2100

G6GCTACATC AGGGT6GGAT TCTATGAG6G CCCTGAGCTG GTGGCCGACA GCAAGGTGGT 2160

CTTGGACACA ACCATGGGGG GTGGCCGCCT GGGGGTCTTC TGCTTCTCCC AGGAGAACAT 2220

CATCTGGGCC AACCTGCGTT ACCGCTGCAA TGACACCATC CCAGAGGACT ATGAGACCCA 2280

TCAGCTGCGG CAAGCCTAGG GACCAGGGTG AGGACCCGCC GGATGACAGC CAOCCTCAGC 2340

GOGGCTGGAT GGGGGCTCTG CACCCAGCCC AAGGGGTGGC CGTCCTGAGG GGGAAGTGA6 2400
AAGGGCTCAG AGAG6ACAAA ATAAAGTGTG TGTGCAGGG

Seq ID NO: 409 Protein sequence
Protein Accession #: NP_000086.1

1 11 21 31 41 51

| | | | | |

MVFDTACVLli LTLAALGAS6 QGQSPLGSDL GPQMLREU2B TKAALQOVRO HLRQQVREIT 60
FLKKTVMECD AOGNQQSVST GLPSVSPUiB CAF6FCFF6V ACIQtGSOGR 06PCPAGFTG 120

BGSHCTDVNE CHAHPCFPRV RCINTSPGFR CEACPPGYSG FTHQCVGLAF AKANKQVCTD 180

INBCETGQHN CVRISVCINT RGSFQCGPC3Q PGPVCTOASG CQSGAQRFCP DGSPSECHEH 240

ADCVLCRQGS RSCVCRVGWA GKGrLOGHDT VUDGFPDEKL RCPEPQCRKD NCVTVPNSGQ 300

EDVDRDGIGD AC31POADGDG VSWEKDNCPI* VRNPDQRNTD ESKKOSACDM CRSQKKDOQK 360

DTD(V)6RGDA OlDDISGDRI RNCZAONCPSV PNSOQXDSDG DGIGDAOSNC PQKSNPDQAD 420

VDHDFVGDAC OSDQDQDGDG HQBSRDNCPT VPNSAQEDSD HDGQGDACDD DDDNDGVPDS 480

RDNCRLVPWP GQEDAD3DGV GDfVCQDDFDA DKWDKIDVC PaiAEVTLTD FRAFQTWLD 540

PEC31AQIDPN WWLKQGREI VQfTKSSDPGL AVGYTAFN6V DFBGTFHVMT VTODOYAGPZ 600

PGYQDSSSFY WMWSQMEQT VWQANPFRAV ABPGIQUCAV KSSTGPGEQL SNALKHTGDT 660

ESQVRLLMKD PHMVGWKDKK SYRWFLQHRP QVGYIRVRFY E6PEI.VAD5II WLDTTMROG 720
RXiGVFCFSQB NIIHAHLRYR GNSTIPEDYE TKQLRQA

Seq ID NO: 410 DNA sequence
Nucleic Acid Accession #: NM_001565.1
Coding sequence: 67..363

1 11 21 31 41 51

| | | | | |

GAGACATTCC TCAATTGCTT AGACATATTC TGAGCCTACA GCAGAGGAAC CTCCAGTCTC 60

AGCACCATGA ATCAAACTGC GATTCTGATT TGCTGCCTTA TCTTTCIGAC TCTAAGTGGC 120

ATTCAAGGAG TACCTCTCTC TAGAACCGTA OGCTGTACCT GCATCAGCAT TAGTAATCAA 180

CCTGTTAATC CAAGGTCTTT AGAAAAACTT GAAATTATTC CTGCAAGCCA ATTTTGTCCA 240

CGTGTTGAGA TCATTGCTAC AATGAAAAAG AAGGGTGAGA AGAGATGTCT GAATCCAGAA 300

TCGAAGCCCA TCAAGAATTT ACTGAAAGCA GTTAGCAAGG AAATGTCTAA AAGATCTCCT 360

TAAAACCAGA GGGGAGCAAA ATGGATGCAG TGCTTCCAAG GATGGACCAC ACAGAGGCTG 420

CCTCTCCCAT CfkCTTCGCTA CATGGA6TAT ATGTCAAGCC ATAATTGTTC TTAiGTTTGCA 480

GTTACACTAA AAOGTGACCA ATGATGGTCA CCAAATCAGC TGCTACTACT OCTGTAGGAA 540

GGTTAATGTT CATCATCCTA AGCTATTCAG TAATAACTCT ACCXrrGGCAC TATAATGTAA 600

GCTCTACTGA GGTGCTATGT TCTTAGTGGA TGTTCTGACC CTGCTTCAAA TATTTCCCTC 660

ACCTTTCCCA TCTTCCAAGG GTACTAAGGA ATCTTTCTGC TTTGGGGTTT ATCAGAATTC 720

TCAGAATCTC AAATAACTAA AAGGTATCCA ATCAAATCTG CTTTTTAAAG AATGCTCTTT 780

ACTTCATGGA CTTCCACTGC CATCCTCCCA AGGGGCCCAA ATTCTTTCAG TCGCTTACCTA 840

CATACAATTC CAAACACATA CAGGAAGGTA GAAATATCTG AAAATGTATG TGTAAGTATT 900

CTTATTTAAT GAAAGACTGT ACAAAGTATA AGTCTTAGAT GTATATATTT CCrATATTGT 960

TTTCAGTGTA CATGGAATAA CATGTAATTA AGTACTATGT ATCAATGAGT AACAGGAAAA 1020

TTTTAAAAAT ACAGATAGAT ATATGCTCTG CATGTTACAT AAGATAAAT6 TGCTGAATGG 1080
TTTTCAAATA AAAATGAGGT ACTCTCCTGG AAATATTAAG

Seq ID NO: 411 Protein sequence
Protein Accession #: NP_001556.1

1 11 21 31 41 51

| | | | | |

MHQTAXLXCC LIFLTLSGIQ GVPLSRTVRC TCISISNQPV NPRSI.EKLEI IPASQPCPRV
EIIATMKKK6 EKSCLNPESK AimiLKAVS KEMSKSSP

Seq ID NO: 412 DNA sequence
Nucleic Acid Accession #: XM_057014
Coding sequence: 143..874

1 11 21 31 41 51

| | | | | |

GGGAGGGAGA GAGGOGCGCG GGTGAAAGGC GCATTGATGC AGCCTGOGGC GGCCTCGGAG 60

CGOGGCGGAG CCAGACGCTG ACCACGTTCC TCTCCTCGGT CTCCTCOGCC TCCAGCTCCG 120

CGCTGCCCGG CAGCCGGGAG CCATGCGACC CCAGGGCCCC GCCGCCTCXX: CXSCAbCGGCT 180

CCGOGGCCTC CTGCTGCTCC TGCTGCTGCA GCTGCCCGCG CCGTCGAGCG CCTCTGAGAT 240

CCCCAAGGGG AAGCAAAAGG OGCAGCTCCG GCAGAGGGAG GTGGTGGACC TGTATAATGG 300

AATGTGCTTA CAAGGGGCAG CAGGAGTGCC TGGTCGAGAC GGGAGCCCTG GGGCCAATGG 360

CATTCCGGGT ACACCTGGGA TCCCAGGTCG GGATGGATTC AAAGGAGAAA AGGGGGAATG 420

TCTGAGGGAA AGCTTTGAGG AGTCCTGGAC ACCCAACTAC AAGCAGIGTT CATGGAGTTC 480

ATTGAATTAT GGCATAGATC TTGGGAAAAT TGGGGAGTGT ACATTTACAA AGATGCGTTC 540

AAATAGTGCT CTAAGAGTTT TGTTCAGTGG CTCACTTCGG CTAAAATGCA GAAATGCATG 600

CTGTCAGCGT TGGTATTTCA CATTCAATGG AGCTGAATGT TCAGGACCTC TTCCCATTGA 660

AGCTATAATT TATTTGGACC AAGGAAGCCC TGAAATGAAT TCAACAATTA ATATTCATCG 720

CACTTCTTCT GTGGAAGGAC TTTGTGAAiC^ AATTQGTGCT CGATTAGTGG ATGTTGCTAT 780

CTCaSGTTGGC ACTTGTTCAG ATTACCCAAA AGGAGATGCT TCTACTGGAT GGAATTCAGT 840

TTCTOGCATC ATTATTGAAG AACTACCAAA ATAAATGCTT TAATTTTCAT TTGCTACCTC 900

TTTTTTTATT ATGCCTTGQA ATGGTTCACT TAAATGACAT TTTAAATAAG TTTATGTATA 960

CATCTGAATG AAAAGCAAAG CTAAATATGT TTACAGACCA AAGTGTGATT TCACACTGTT 1020

TTTAAATCTA GCATtATTCA TTTTGCTTCA ATCAAAAGTG GTTTCAATAT TTTTTTTAGT 1080

TGGTTAGAAT AcrrrcTTCA TAGTCACATT CrCTCAACCT ATAATTTGGA ATATTGTTGT 1140

GGTCTTTTGT TTTTTCTCTT AGTATAGCAT TTTTAAAAAA ATATAAAAGC TACCAATCTT 1200

TGTACAATTT GTAAATGTTA AGAATTTTTT TTATATCTGT TAAATAAAAA TTATTTCCAA 1260
CAACCTTAAA AAAAAAAAAA AAAA

Seq ID NO: 413 Protein sequence
Protein Accession #: XP_057014

1 11 21 31 41 51

| | | | | |

MRPQGPAASP QRLRGLLLLL LLQLPAPSSA SEIPKGKQKA OLRQREWDL YNGMCLQGPA 60

GVPGRDGSPG ANGIPGTPGX PGRDGFKGEK GECLRESFEE SWTPNYKQCS WSSUIYGIDL 120

GKIAECTFTK MRSNSALRVL FSGSLRLKCR KACGQaWYFT FNGAECSGPL PIEAXIYZiDQ 180

GSPEMNSTZN IHRTSSVE6L CEGIGAGLVD VAIWVGTCSD YPKGDASTGW NSVSRXIIEE 240
LPK

Seq ID NO: 414 DNA sequence

Nucleic Acid Accession #: XM_084007

Coding sequence: 138..2405

1 11 21 31 41 51

| | | | | |

CTCGTGCCGA ATTCGGCACG AGACOGOGTG TTCGCGCCTG GTAGAGATTT CTCGAAGACA 60 

CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CX3CGTGGCCG GGCCGTGGGA CAACGAGGCC 120 

GCGGAGACGA AGGCGCAATG GCGAGGAAGT TATCTGTAAT CTTGATCCTG ACCTTTGCCC 180 

TCrCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAG ACCACTGAGA 240 

AAATTAGICC GAATTGGGAA TCIGGCftTTA ATGTTGACTT GGCAATTTCC ACACQGCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATG GAGAAAATAA TTCTTTGTCA GTT6AAQGGT 360 

TCAGAAAATT ACTTCAAAAT ATAGGCATAG ATAAGATTAA AAGAATCCAT ATACACCATG 420 

ACCACGACCA TCACTCAGAC CACGAGCATC ACTCAGACCA TGAGCGTCAC TCAGACCATG 480 

AGCATCACTC AGACCACGAG CATCACTCTG ACCATGATCA TCACTCCCAC CATAATCATG 540 

Cit i Ci■i'C1^ 3G TAAAAAtAAG CGAAAAGCTC TTTGCCCftGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACOG ACCAGAACAT GCCAGTGGTA 660

GAAGGAATGT CAAGGACAGT GTTAGT6CTA GTGAAGTGAC CTCAACTGTG TACAACACTG 720

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTOC AAGACCTGGA AAACTCTTOC 780 

CCAAAGATGT AAGCAGCTGC ACTCCACCCA GTGTCACATC AAAGAGCOGG GTGAGCCGGC 840

TGGCTGGTAG GAAAACAAAT GAATCTGTGA GTGAGCCCCG AAAAGGCTTT ATGTATTCCA 900 

GAAACACAAA TGAAAATCCT CAGGAGTGTT TCAATGCATC AAAGCTACTG ACATCTCATG 960 

GCATGGGCAT OCAGGTTCCG CTGAATGCAA CAGAGTTCAA CTATCTCTGT CCAGCCATCA 1020 

TCAACCRAAT TGRTGCTAGA TCTTGTCTGA TTCATACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGGTTGGTGG TTTTATAGCC ATTTCCATCA 1140

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TAGTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATTTCTCCT GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAACCAG 1320 

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500

AGAAGAAACC TGAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTGC AAGTAIGAAT 1560 

CTCAACTTTC AACAAAT6AG GAGAAAGTAG ATACAGATGA TCGRACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGGAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACCCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCAOCAA AACCACCATC 1860 

CTCACRGTCA CAGCCAGCGC TACTCTCGGG AGQAGCTQAA AOATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGCC TGCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTTCTGTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACOGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGCAACAG 2160 

GAATTTTCAT TQGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTT6ATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280

GTGACCATGG ATGTAGCCQC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT OGTATAAATT 2400 

TCTAGTTAAG GTTTAAATGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGGAGATGA GTTTGTATGC TGTACTATGC AGCGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTAOGTTTTA ATATTTAAGT 2580

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTC^ GGTATTAOCA GTTTATTATG 2640 

TAAACAAGAG ATTT6GCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000 

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGCGGG TTATATACCA GATGAGTACA GTGAGTA6TT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300

TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GA6CAATTGT CTTTATATAC 6GTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATATCAGCTT G 

Seq ID NO: 415 Protein sequence 
Protein Accession #: XP_084007

1 11 21 31 41 51

| | | | | |

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISPNW BSGINVDLAI STRQYHLQQL 60 

PmYGENNSL SVBGFRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120 

EHHSDHDHHS HHNHAASGKN KRKALCPDHD SDSSGKDPRN SQGKGAHRPE HASGRRNVKD 180 

SVSASEVTST VYNTVSEGTH FLETIETPRP GKLPPKDVSS STPPSVTSKS RVSRLAGRKT 240 

KESVSEPRKG FMYSRNTNEN PQECFNASKli LTSHGMGIQV PLNATEFNYL CPAIINQIDA 300 

RSCLIHTSEK KAEIPPKTYS LQIAWVGGFI AISIISPLSL LGVILVPLMN RVFFKFLLSF 360 

LVALAVGTLS GDAFLHL&PH SHASHHHSHS aEEPAKEMXR GPLFSKLSSQ NIEESAYFDS 420 

TWKGLTALGG LYPMPLVEBV LTLZKQFKDK KKKHQKKPEN DDDVBIKXQXj SKYESQLSTN 480 

EEKVDTDDRT EGYLRAOSQE PSHFDSQQPA VLBEEEVMIA KAHPQEVYNE YVFR6CRNKC 540 

HSHFHDTLGQ SODLIHHHHD YHHILHHMHH QMHHPHSHSQ RYSREELKDA 6VATIANMVI 600 

MGDGLHNFSD GLAIGAAFTE GLSSGLSTSV AVPCHELPHE LGDPAVLLKA CaflTVKQAVLY 660 

NALSAMLAYL GMATGIFIGH YAENVSMWIF ALTAGLFMYV ALVDKVPSIL BNOASDHGCS 720 
RHGYPFLQNA GMLLGFGIML LISIFEHKIV FRINF 

Seq ID NO: 416 DNA sequence 

Nucleic Acid Accession #: NM_015419.1

Coding sequence: 1..8467 
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1 XI 21 

I 1 1 

ATGCCCAAGC GOGCGCACTG GGGGGCCCTC 
CCGCGAGTGG OGCTGGCCTG CCOGCATOCT 
TGCACGTTCC GATOCCTGGC TTCCGTGCX3C 
AATTTGGGGT TTAATAGCAT ACAGGCOCTC 
TTGGAGCTAC TTATGATTCA CGGCAATGRG 
GACCTCAGCT CTCTTCAGGT TTTCAAGTTC 
CAGACCCTCC AGGGTCTCTC TAACTTAATG 
TTTATCCACC CTCAAGCTTT CAAOSGCTTA 
AATCrCCTCC AOCAGCTGCA CCOCAGCACC 
AGACTCTCCA CCATAAGGCA CCTCTACTTA 
AGCATGCTTC GGAACATGCC GCTTCTGGAG 
TGCGATTGTG AGATGAGATG UTm'l GGAA 
TGTAAAAAGG ACAAAGCTTA TGAAGGOGGT 
AAGTTGTACA AACATGAGAT ACACAAGCTG 
GAGTCCCCTC TGAGACAGAA CAGGAGCAOG 
GATGGTGGCA GCC3^CTCAT CCTGGAGAAA 
AATATGACCX3 ACGAGCACGG GAACATGGTG 
GATGTGTACA AGATTCACTT GAACCAAACG 
GTTGCCTTGG ACTTTGAGTG TCCAATGACC 
ATAGCATACT ACAGTGAAGT TCCOGTGAAG 
CCCAGAGTCA GCTACCAGTA CAGGCAGGAT 
GT6AGA6CCC AQATTCTTGC AGAACCAGAA 
CTGAACCX5AC GTCAGAGTAC GGCCAAGAAG 
CAAACAATAT CCACCAAAGA TACAAGGCAG 
CCTAGTG6AG CTGTGCAAAG AGATCAGACT 
TGCAAGGTGA AAGCTTCTGA GAGTCCATCT 
CT6AAAG0GC CCATGGATGA CCCAGACAGC 
AGGATCAAGT CCATGGAGCC ATCTGACTCA 
GATGAAATGG ACCGCATGGT ATATAGGGIA 
GAGAAAGACA CAGTGACAAT TGGCAA6AAC 
GCTTTAGCAA TACCCGAAGC CCACCTTAGC 
GATTTGGCTA ACACATCACA TGTATACATG 
GTCCAAGTCA GTGATAGTGG TTACTACAGA 
CATTTTACGG TGG6AATCAC AGTGACCAAG 
AGAGGCCCAG GTGCAAAGGC TCTTTCCAGA 
GGCTCGGGCA TGGGAGATGA AGAGAACACT 
GAGGTGTTCX: TCAAAACAAA GGATGATGCC 
AGAAGAAAGC TGAAACTCTG GAAGCATTCG 
GGTCGCAGAG TGTTTGAATC TAGACGAAGG 
GAGOGCTGGG CTGATATTTT AGCCAAAGTC 
GTACCCCCGT TGATTAAAAC CACAAGTCCT 
TTTCCTGCTG TTTCTCCCCC CTCAGCATCT 
TCCTCAGCAG ATGTACCTCT ACTTGGTGAA 
GCCAGCATGG GGCTAGAACA CAACCACAAT 
AGCACACCTC TGGAGGAAGT TGTTGATGAC 
ACTGAAGGAG ACCTGAAGGG GACAGCAGCC 
TCTOCTACTC TGCACACATT AGACACAGTC 
ACAGAGGGTT GGTCTGCAGC AGATGTTGGA 
GAGCCTCCAT TGGATGCTGT CTCCTTGGCT 
GATTTGGAGA CTAAGTCACA ACCAGATGAG 
CTTACTCCAA CCCCCACCAT CTGGGTTAAT 
TCTACTATAG GGGAACCAGG TGTCXTCAGGC 
ATCCACCTTG TGAAAAGTAG TCTAAGCACT 
AAAGAGATGT CTCAGACACT ACAGGGAGGA 
AGAA6TTCTG AGAGTGAGGG CCAAGAQAGC 
GGTATAATGA GCAGTATGTC TCCAGTTAAG 
CTAGACAAAG ACACCACAAC AGTAACAACA 
ACCATGAGCA CTCACCCTTC TCGAAGGAGA 
AAATTCCGCC ACCGGCACAA GCAAACCCCA 
TCTACTCAAC CAACTCAAGC AOCTGACATT 
GTTCCTACAG CTT6GGTGGA TAACACAGTT 
AATGCAGAAC CCACATCCAA GGGAACACCA 
CATCGATATA CCCCTTCTAC AGTGAGCTCA 
CCAGAAAATA AACATAGAAA CATTGTTACT 
ACTGTTTCTC TGAAAACTGA GGGCCCTTAT 
AAAATATATT CATCTTACCC TAAAGTCCAA 
TCAGATGGAA AAGAAATTAA GGATGATGTT 
ATTTTAGTCA CT6GTGAATC AATTACTAAT 
ACTATGGGA6 AATTTAAGGA AGAATCCTCT 
AATCCCTCAA GGACGGCCCA GCCTGGGAGG 
GGGGAAAATC TTACAGACCC TCCCCTTCTT 
GAGTTrTTGT CCTCTTTGAC AGTCTCCACA 
ACAACTCTCT CAAGCATAAA AGTGGAGGTG 
GATCAAGATC ATCTTQAAAC CACTGTGGCT 
CACACCXX:rA CTGCTGCCCG GATGAAGGAG 
ATGTCTTTGG GACAAACCAC CACCACTAAG 
GCATCTAGAG ATTCCAAGGA AAATGTTTTC 
GCAACCCCAG TCAACAATGA AGGAACACAG 
CCCTCTTCCG ACOGGGATGC ATTTAACTTG 
TTTGGTAGTA GGAGTCTACC ACG7GGCCCA 
GCTTCTCATC AACTAACCAG AGTCCCTGCC 
CTACCTGAAA TGTCCACACA AAGOGCTTCC 
CACTGGACCA ACAAA006GA AATAACTACA 
CAGTTTACAA CTCCAAGATT ATCAAGTACA 



31 41 51 

I t 1 

TCOGTGGTGC TGATCCTGCT TIGGGGCCAT 60 

TGTGOCTGCT AOGTCCC CA G OGAGGTCCAC 120 

GCTGGCATTG CTAGACAOGT OGAAAGAATC 180 

TCAGAAACCT CATTT6CAGG ACTGACCAAG 240 

ATCCCAAGCA TCCCOGATGG AGCTTPAAGA 300 

AGCTACAACA AGCTGAGAGT GATCACAGGA 360 

AGGCTGCACA TT6ACCACAA CAAGATOGAG 420 

AOGTCTCTGA GGCTACTCCA TTTGGAAGGA 480 

TTCTCCACGT TCAC A TTTTT GGATTATTTC S40 

GCAGAGAACA TGGTTAGAAC TCTTCCTGCC 600 

AATCTTEACT TGCAGGGAAA TCOGTGGACC 660 

TGGGATGCAA AATCCAGAGG AATTCTGAAG 720 

CAGTTGTGTG CAATGTGCTT CAGTCXIAAAG 780 

AAGGACAT6A CTTGTCTGAA GCCTTCAATA 840 

AGTATTGAGG AGGAGCAAGA ACAGGAAGAG 900 

TICCAACXGG CCCAGTGGAG CATCTCTTTG 960 

AACTTGGTCT GTGACATCAA GAAACCAATG 1020 

GATCCTCCAG ATATTGACAT AAATGCAACA 1080 

OGAGAAAACT ATGAAAAGCT ATGGAAATTG 1140 

CTACACAGAG AGCICATGCT CAGCAAAGAC 1200 

GCT6ATGAGG AAGCTCTTTA CTACACAGGT 1260 

TOGGTCATGC AGGCATCCAT AGATATCCAG 1320 

GTGCTACTTT OCTACTACAC OCAGTATTCT 1380 

GCTCGGGGCA GAAGCTGGGT AATGATTGAG 1440 

GTCCTGGAAG GGGGTCCATG CCAGTTGAGC 1500 

ATCTTCTGGG TGCTTCCAGA TGGCTCCATC 1560 

AAGTTCTCCA TTCTCAGCAG TGGCTGGCTG 1620 

GGCTTGTACC AGTGCATTGC TGAAGTGAGG 1680 

CTTGTGCAGT CTCCCTCCAC TCAGCCAGCC 1740 

CCAGGGGAGT OGGTGACATT GCCTTGCAAT 1800 

TGGATTCTTC CAAACAGAAG GATAATTAAT 1860 

TTGCCAAATG GAACTCTTTC CATCCCAAAG 1920 

TGTGTGGCTG TCAAOCAGCA AGGGGCAGAC 1980 

AAAOGGTCTG GCTTGCCATC CAAAAGAGGC 2040 

GTCAGAGAAG ACATOGTGGA GGATGAAGGG 2100 

TCAAGGAGAC TTCTGCATCC AAAGGACCAA 2160 

ATCAATGGAG AGAAGAAAGC CAAGAAAGGG 2220 

GAAAAAGAAC CAGAGACCAA TGTTGCAGAA 2280 

ATAAACATGG CAAACAAACA GATTAATCCG 2340 

CGTGGGAAAA ATCTCCCTAA ioGGCACAGAA 2400 

CCATCCTTGA GCCTAGAAGT CACACCACCT 2460 

CCTGTGCAGA CAGTAACCAG TGCTGAAGAA 2520

GAAGAGCACG TTTTGGGTAC CATTTCCTCA 2580

GGAGTTATTC TTGTTGAACC TGAAGTAACA 2640 

CTTTCTGAGA AGACTGAGGA GATAACTTCC 2700 

CCTACACTTA TATCTGAGCC TTATGAACCA 2760 

TATGAAAAGC CCACCCATGA AGAGACGGCA 2820 

TG6TCACCAG AGCXX»CATC CAGTGAGTAT 2880 

GAGTCTGAGC CCATGCAATA CTTTGACCCA 2940 

GATAAGATGA AAGAAGACAC CTTTGCACAC 3000 

GACTCCAGTA CATCACAGTT ATTTGAGGAT 3060 

CAATCACATC TACAAGGACT GACAGACAAC 3120 

CAAGACACCT TACtGATTAA AAAGGGTATG 3180 

AATATGCTAG AGGGAGACCC CAiCACACTOC 3240 

AAATCCATCA CTTTGCCTGA CTCCACACTG 3300 

AAGCCTGCGG AAACCACAGT TGGTACCCTC 3360 

ACACCAAGGC AAAAAGTTGC TCOGTCATCC 3420 

CCCAACGGGA GAAGGAGATT ACGCCCCAAC 3480 

CCCACAACTT TTGCCCCATC AGAGACTTTT 3540 

AACATTTCAA GTCAAGTGGA GAGTTCTCTG 3600 

AATACCCCCA AACAGTTGGA AATGGAGAAG 3660 

CGGAGAAAAC ACGGGAAGAG GCCAAACAAA 3720 

AGAGCGTCCG GATCCAAGCC CAGCCCTTCT 3780 

CCCAGTTCAG AAACTATACT TTTGOCTAGA 3840 

GATTCCTTAG ATTACATGAC AACCACCAGA 3900 

GAGACACTTC CAGTCACATA TAAACCCACA 3960 

GGCACAAATG TTGACAAACA TAAAAGTGAC 4020 

GCCATACCAA CTTCTCGCTC CTTGGTCTCC 4080 

CCTGTAGGCT TTCCAGGAAC TCCAACCTGG 4140 

CTACAGACAG ACATACCTGT TACXZACTTCT 4200 

AAAGAGCTTG AGGATGTGGA TTTCACTTCC 4260 

CCATTTCACC AGGAAGAAGC TGGTTCTTCC 4320 

GCTTCAAGTC AGGCAGAAAC CACCACCCTT 4380 

ATTCTCCTTT CTGAAACTAG ACCACAGAAT 4440 

CCAGCATCCT CGTCOCCATC CACAATTCTC 4500 

CCAGCACTTC CCAGTCCAAG AATATCTCAA 4560 

TTGAATTATG TGGGGAATCC AGAAACAGAA 4620 

CATATGTCAG GGCCAAATGA ATTATCAACA 4680 

TCTACAAAGC TGGAATTGGA AAAGCAAGTA 4740 

GATAGCCAAC GCCAGGATGG AAGAGTTCAT 4800 

AAAOCCATCC TACCAACAGC AACAGTGAGG 4860 

AGATACTTTG TAACTTCCCA 6TCACCTCGT 4920 

TATCCTTCTG GGGCTTTGCC AGAGAACAAA 4980 

ACAATTCCTC TCCCATTGCA CATGTCCAAA 5040
CCCAGCATTC CTAGTAAGTT TACTGAOOGA AGAACTGAOC AATTCAATGG TTACTCCAAA 5100

GTGTTTOGAA ATAACAACAT CCCTGAGGCA AGAAA0CX3G TTGGAAAGCC TCOCAGTCCA 5160 

AGAATTCCTC ATTATTCCAA TGGAAGACTC CCTTTCTTTA CCAACAAGAC TCTTTCTTTT 5220 

CCACAGTTGG GAGTCACCOG GAGACCCCAG ATACCCACTT CTCCTGCCCC AGTAATGAGA 5280 

GAGAGAAAAG TTATTCCAGG TTCCTACAAC AGGATACATT CCCATACCAC CiTCCATCTG 5340 

GACTTTQGCC CTCOQ6CA0C TCOGTTGTTG CACACTCOGC AGACCAC6GG ATCA COCTCA 5400 

ACTAACTTAC AGAATATCCC TATGGTCTCT TCCACCCAGA GTTCTATCTC CTTTATAACA 5460 

TCTTCTGTCC AGTCCTCAGG AAGCTTCCAC CAGAGCAGCT CAAAtSTTCTT TGCAGGAGGA 5520 

CCTCCroCAT CCAAATTCTG CTCTCTTGGG GAAAAGCCCC AAATCCTCAC CAAGTCCCCA 5580

CAGACTGTCT COGTCACCGC TGAGACAGAC ACTGTGTTCC CCTGTGAGGC AACAGGAAAA 5640 

CCAAAGGCTT TCGTTACTTG GACAAAGGTT TCCACAGGAG CTCTTATGAC TCOGAATACC 5700 

AGGATACAAC G GT TT G AGGT TCTCAAGAAC QGTACCTTAG TGATAOGGAA GGTTCAAGTA 5760 

CAAGATOGAG GCCAGTATAT GTGCACOGCC AGCAACCTGC AOGGCCTGGA CAGGA3GGTG 5820 

GTCTTGCTTT CGGTCACCGT GCAGCAACCT CAAATCCTAG CCTCOCACTA OCAGGAGGTC 5880 

ACTGTCTACC TGGGAGACAC CATTGCAATG GAGTGTCTGG OCAAAGGGAC OOCAGCOCCC 5940 

CAAATTTCCT GGATCTTCCC TGACAGGAGG GTGTGGCAAA CTGTGTOCCC OGTGGAGAGC 6000 

CGCATCACCC TGCACGAAAA CCGGACCCTT TCCATCAAGG AGGOGTCCTT CTCAGACAGA 6060 

GGCGTCTATA AGTGCGTGGC CAGCAATGCA GC0GGG6CGG ACAGCCTOGC CATCXX3CCTG 6120 

CACGTGGOQG CACTGCCCCC CGTTATOCAC CAGGAGAAGC TGGAfiAACAT CTCGCTGCCC 6180 

CCGGGGCTCA GCATTCACAT TCACTGCACT GCCAAQGCTG CGCCCXrTGCC CAG CGTGOGC 6240 

TGGGTGCTCG GGGACGGTAC CCAGATCCGC CCCTCGCAGT TCCTCCAOSG GAACTTGTTT 6300 

GTPTTCCCCA AOGGGACGCT CTACATCCGC AACCTCGCGC CCAAGGACAG CGGGCGCTAT 6360 

GAGTGCGTGG CCGCCAACCT GGTAGGCTCC GCGOGCAGGA CGGTGCAGCT GAAOGTGCAG 6420 

G6TGCAGCAG CCAA06G6CG CATCAGGGGC AOCXCCCCGC GGAG6ACGGA CGTCAGGTAC 6480 

G6AGGAACCC TCAAGCTGGA CTGCAGCGCC TGGGGGGACC CCTGGOOGOS CATCCTCTGG 6540 

AGGCTGCCGT CCAAGAGGAT GATCGACGCG CXCrTCAGTT TTGATAGCAG AATCAAGGTG 6600 

TTTGCCAATG GGACCCTGGT GGTGAAATCA GTGACXX3ACA AAGATGCCGG AGATTACCTG 6660 

IGCGTAGCTC GAAATAAGGT TGGTGATGAC TAOGTGGTGC TCAAAGTGGA TGTGGTGATG 6720 

AAACOGGCCA AGATTQAACA CAAGGAGGAG AACGACCACA AAGTCTTCTA CGGGGGTGAC 6780 

CTGAAAGTG6 ACTGTGTGGC CACOGGGCTT CCCAATCCCG AGATCTCCTG GAGCCTCCCA 6840 

GACGGGAGTC TGGTGAACTC CTTCATGCAG TCGGATGACA GCGGTGGACG CACCAAGCGC 6900 

"EATGTOjTCT TCAACAATGG GACACTCTAC TTTAACGAAG TGGGGATGAG GGAGGAAGGA 6960 

GACTACACCT GCTTTGCTGA AAATCAGGTC GGGAAGQACG AGATGAGAGT CAGAGTCAAG 7020 

GTGGTGACAG CGCCCGCCAC CATCCGGAAC AAGACTTACT TGGCGGTTCA GGTGCCCTAT 7080 

GGAGACGTGG TCACTGTAGC CTGTGAGGCC AAAGGAGAAC CCATGCCCAA GCTGACTTGG 7140 

rrGTCCCCAA CCAACAAGGT GATCCCCACC TCCTCTGAGA AGTATCAGAT ATACCAAGAT 7200 

GGCACTCTCX TTATTCAGAA AGCCCAG06T TCTGACAGCG GCAACTACAC CTGCCTGGTC 7260 

AGGAACAGOG CGGGAGAGGA TAGGAAGAOG GTGXX3GATTC AOGTCAAOGT CCAGCOVCCC 7320 

AAGATCAAOG GTAACCCCAA CCCCATCACC ACC3GTGCGGG AGATAGCAGC CGGGGGCAGT 7380 

CGGAAACTGA TTGACTGCAA AGCTGAAGGC ATCCCCACCC CGAGGGTGTT ATGGGCTTTT 7440 

OCCGAGGGTG TGGTTCTGCC AGCTCCATAC TATGGAAACC GGATCACTGT CCATGGCAAC 7500 

GGTTCCCTGG ACATCAGGAG TTTGAGGAAG AGCGACTCCG TCCAGCTGGT ATGCATGGCA 7560 

OGCAACGAGG GAGGGGAGGC GAGGTTGATC GTGCAGCTCA CTGTCCTGGA GCCCATGGAG 7620 

AAACCCATCT TCCACGACCC GATCAGCGAG AAGATCAOGG CCATGGCGGG CCACACCATC 7680 

AGCCTCAACT GCTCTGOOGC GGGGAC0C06 ACACCCAGCC TGGTGTQGGT CCTTCCCfMS 7740 

GGCACCGATC TGCAGAGTGG ACAGCAGCTG CAGOGCTTCT ACCACAAGGC TGACGGCATG 7800 

CTACACATTA GCGGTCTCTC CTCGGTGGAC GCTGGGGCCT ACCGCPGOGT GGCCCGCAAT 7860 

GCCGCTGGCC ACACGGAGAG GCTGGTCTCC CTGAAGGTGG GACTGAAGCC AGAAGCAAAC 7920 

AAGCAGTATC ATAACCTGGT CAGCATCATC AATGGTGAGA CCCTGAAGCT CCCCTGCACC 7980 

CCTCCCGGGG CTGGGCAGGG AOGTTTCTCC TGGAOGCTCC CCAATGGCAT GCATCTGGAG 8040 

GGGCOCCAAA CCCTGGGACG OGTTTCTCTT CTGGACAATG GCACCCTCAC GGTTCGTGAG 8100 

GCCTOGGTGT TTGACAGGGG TACXTTATGTA TGCAGGATGG AGA0GGA6TA OGGCCCTTOG 8160 

GTCACCAGCA TCCCCGTGAT TGTGATCGCC TATCCTCCCC GGATCACCAG OGAGCCCACC 8220 

CCGGTCATCT ACACCCGGCC CGGGAACACC GTGAAACTGA ACTGCATCGC TATGGGGATT 8280 

CCCAAAGCTG ACATCACGTG GGAGTTACCG GATAAGTCGC ATCTGAAGGC AGGGGTTCAG 8340 

GCTCGTCTGT ATGGAAACAG ATTTCrTCAC CCCCAGGGAT CACTGACCAT CCAGCATGCC 8400 

ACACAGAGAG ATGCCGGCTT CTACAAGTGC ATGGCAAAAA ACATTCTOGG CAGTGACTCC 8460 

AAAACAACTT ACATCCAOGT CTTCTGAAAT GTQGATTCCA 6AATGATTGC TTAGGAACTG 8520 

ACAACAAAGC GGGGTTTGTA AGGGAAGCCA OGTTGGGGAA TAGGAGCTCT TAAATAATGT 8580 

GTCACAGTGC ATGGTGGCCT CTGGTGGGTT TCAAGTTGAG GTTGATCTTG ATCTAC AATT 8640 

GTTGGGAAAA GGAAGCAATG CAGACACGAG AAGGAGGGCT CAGCCTTGCT GAGACACTTT 8700 

CTTTTGTGTT TACATCATGC CAGGGGCTTC ATTCAGGGTG TCTGTGCTCT GACTGCAATT 8760 

mxrc v c n i' tgcaaatgcc actcgactgc cttcataagc gtccatagga tatctgagga 8820 

ACATTCATCA AAAATAAGCC ATAGACATGA ACAACACXTTC ACTACCCCAT TGAAGACGCA 8880

TCACCTAGTT AACCTGCTGC AGTTTTTACA TGATAGACTT TGTTCCAGAT TGACAAGTCA 8940 

TCTTTCAGTT ATTTCCTCTG TCACTTCAAA ACTCCAGCTT GCCCAATAAG GATTTAGAAC 9000 

CAGAGTGACT GATATATATA TATATATTTT AATTCAGAGT TACATACATA CA GCTAC CAT 9060 

TTTATATGAA AAAAGAAAAA CATTTCTTCC TGGAACTCAC TTTTTATATA ATGTTTTATA 9120 

TATATATTTT TTCCTTTCAA ATCAGACGAT GAGACTAGAA GGAGAAATAC TTTCTGTCTT 9180 

ATTAAAATTA ATAAATTATT GGTCTTTACA AGACTTGGAT ACATTACAGC AGACATGGAA 9240 

ATATAATTTT AAAAAATTTC TCTCCAACCT CCTTCAAATT CAGTCACCAC TGTTATATTA 9300 

CCTTCTCCA6 GAACCCTCCA GTGGGGAAGG CTGCGATATT AGATTTCCTT GTATGCAAAG 9360 

TTTTTGTTGA AAGCTGTGCT CAGAGGAGGT GAGAGGAGAG GAAGGAGAAA ACTGCATCAT 9420 

AACTTTACAG AATTGAATCT AGAGTCTTCC CCX3AAAAGCC CAGAAACTTC TCTGCAGTAT 9480 

CTGGCTTGTC CATCTGGTCT AAGGTGGCTG CTTCTTCCCC AGCCATGAGT CAGTTTGTGC 9540 

CCATGAATAA TACAOGAGCT GTTATTTCCA TGACTGCTTT ACTGTATTTT TAAGGTCAAT 9600 
ATACTGTACA TTTGATAATA AAATAATATT CTCTCAAAAA AAAAA 

Seq ID NO: 417 Protein sequence 
Protein Accession #: NP_056234.1

1 11 21 

I i i 

HPXRAHWGAL SWLILLWGH PRVALACPHP 
NLGFNSIQAL SBTSFAGLTK LELLMIfiGNE 
QTLQGLSNLH RLBIDHNKIE PIHPQAFNGL 
RLSTIRHLYL AENMVRTLPA SMLRMMPLI.E 
I I 1

CACWPSEVH CTFRSLASVP AGIARHV£RZ 60 

IPSIPDGAUt DLSSLQVFKP SYNKLRVITG 120 

TSLALLHLEG NLLHQLHPST FSTPTFLDYP 180

NLYLQGNPHT CDCENHWFLE HDAKSRGILK 240 
GOCDKAYBOG QLCAKCPSFK KLYKKEZHKL
DGGSQIiILBK FQLFQUsiSIi KKTOEBQ^fV 
VALDFECPMT RENYEKLWKL lAYYSEVPVK 
VKAQILASPE WVJ«3PSI1)IQ UniaQSTAKX 
PSGAVQROQT VLEGGPOQLS QJVKASESPS 
R2KSKEPSDS GLYQCZAaVR DEKDBNVYBV 
ALAIPEAHZiS WILFNRRIZN DXiANTSEVYM 
HFTVGITVTK KGSGLPSKRG RRPGAKALSR 
EVFUCrXDDA INGDKKAKKG RRXLKUJKES 
ERWADILAKV BGKNIiPKGTE VPPLIKTTSP 
SSADVPLLGS EEHVLGTISS ASKGLEKNKN 
TEGDLXGTAA PTLISEPYEP SPTLHTLDTV 
EPPLDAVSLA BSEPMQYFDP DL&TKSQPOB 
STIGEPGVPG QSHLOGLTDN IHLVKSSLST 
RSSESEGQES KSITLPDSTL GIMSSMSPVK 
TMSTHPSRRR PNGRRRLRPN KFHHRHKQTP 
VPTAWVDNTV SITPKQLEMEK NAEPTSKGTP 
PEHKHSNIVT PSSBTILLPR TVSLKTEGPY 
SDGKEIRDDV ATNVDXHKSD ILVTGBSITN 
NPSRTAQPGR liQTDIPVTTS GENLTDPPLL 
TTLSSIKVEV ASSQAETTTL DQDHLSTTVA 
MSLGQTTTTK PALPSPRISQ ASRDSKENVP 
PSSOROAFNL STKLELEKQV FGSRSLPRGP 
LP&KSTQSAS RYFVTSQSPR HWTNKPBITT 
PSIPSKFTDR RTDQFNGYSK VPGNNNIPEA 
PQLGVTRHPQ IPTSPAPVMR ERKVIPGSYN 
TNLQNIPJ4VS STQSSISPIT SSVQSSGSFH 
OTVSVTAETD TVFPCEATGK PKPPVTWTKV 
QDRGQYWCTA SNLHGLORMV VLLSVTVQQP 
OISWIFPDRR VWQTVSPVES RITLHENRTL 
HVAALPPVIH QEKLENISLP PGLSZHIHCT 
VFPNGTLYIR NIAPKDSGRY ECVAANLVGS 
GGTLKLDCSA SQDPHPRILH RLPSKRMIDA 
CVAHNKVGDD YWIiKVDWM KPAKIEHKEE 
DGSLVNSFMQ SDDSGGRTKR YWFNKGTLY 
WTAPATIRN KTYLAVQVPY GDWTVACEA 
GTIiLZQKAQR SDSGHYTOiV R17SAGEDRKT 
RKLZDCKAEG ZPTPRVLWAF PEGWLPAPY 
RNEGGEARLZ VQLTVLEPME KPIFHDPZSE 
GTDLOSGQQL QRFYHKADGM LHISGLSSVD 
KQYHNLVSII NGBTLKLPCT PPGAGQGRFS 
ASVFDRGTYV CRMETEYGPS VTSZPVZVZA 
PKAOITWELP DKSHLKAGVQ ARLYGNRPLH 
KTTYIHVF 



KOfTCUCPSZ ESPIiROIISSS SIKKHQEQES 300 

NLVCDZKKPM DVYKIHI2IQT DPPDIDINAT 360 

LHREZJCSKD PRVSYQYRQD ADEEALYYTG 420 

VLLSYYTQYS QTISTKDTRQ ARGRSWVMIE 480 

ZFt<VX*PDGSZ UCAPMDDPOS KPSIIiSSGWL 540

LVQSPSTQPA EKDTVTIGKH PGESVTIfQI 600 

LPNGTLSIPX VQVSDSGYYR CVAVKQQ6AD 660 

VREDZVEDSG GSOQGDEENT SRRLLHPKDQ 720 

EKEPETNVAH GRRVFESRRR INJ4ANKQINP 780 

PSLSLEVTPP FPAVSPPSAS PVQTVTSAEE 840 

GVILVEPSVT STPLEEWDD LSEKTEEITS 900 

YEKPTHEETA TEGWSAADVG SSPEPTSSEY 960 

DXMKEDTFAH LTPTPTZWVN DSSTSQLFED 1020 

QDTIiLZKRGM KEMSQTLQGO NMLEGDPTBS 1080 

KPAETTVGTL LDKDTTTVTT TPRQKVAPSS 1140 

PTTFAPSETF STQPTQAPDI KIS^JJVESSL 1200 

RRKHGKHPNK HRYTPSTVSS HASGSKPSPS 1260 

DSLDYMTTTR KIYSSYPKVQ ETLPVTYKPT 1320 

AZFTSRSLVS TKGEPKEBSS PVGFPGTPTW 1380 

KELEDVOFTS EFLSSLTVST PFEQKEAGSS 1440 

ILLSETRPQN ETP7AARMKE PASSSPSTZL 1500 

IiNYVOIPETE ATPVNKEGTQ HMSGPKELST 1560 

DSQRQDGRVH ASHQLTRVPA KPILPTATVR 1620 

YPSGALPENK QPTTPRI>SST TIPLPLHMSK 1680 

RNPVGKPPSP RIPKYSNGRL PFFTNKTLSF 1740 

RIHSHSTFHL DFGPPAPPZJi HTPQfTTGSPS 1800 

QSSSKFFAGG PPASKFWSIiG EKPQZLTKSP 1860 

STGAIJ4TPNt RZQRFEVLKN GTLVZRKVQV 1920 

QltASHYQDV TVYLOrriAM ECLAKGTPAP 1980 

SIKEASFSDR GVYKCVASNA AGADSLAIRL 2040 

AKAAPIjPSVR WVLGDGTQZR PSQFLHGm>F 2100 

ARRTVQUIVQ RAAANARITG TSPRRTDVRY 2160 

LFSFDSRIKV FAMGTLWKS VTDKDAGDYL 2220 

NDHKVFYGGO LKVDCVATGL PNPEISWSLP 2280 

FNEV01REEG DYTCFAENQV GKDEMRVRVK 2340 

KGEPMPKVTW LSPTSKVIPT SSEKYQIYQD 2400 

VWIHVNVQPP KINGNPHPIT TVREIAAGGS 2460 

YGNRITVHGN GSLDIRSLRK SDSVQLVCMA 2520 

KITAMAGHTZ SLNCSAAGTP TPSLVWVLPN 2580

AGAYRCVARN AAGHTERLVS IiKVGLKPEAN 2640 

HTLFNGKHLE GPQTLGRVSL LDNGTIiTVRE 2700 

YPPRZTSBPT PVIYTRPGNT VKLN01AKGZ 2760 

PQGSLTIQKA TQRDAGFYKC HAKNILGSDS 2820 



Seq ID NO: 418 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1.. 5001

1 11 21 31 41 51 

I I 1 1 ! I 

ATGCCAGGCA CAAAACTAAC CCGAACAG6C GCCCCA6CA6 AC7ACAGAGT GATATTGAAG 60 

ACCTCTCAAG AGGACGAATT GGATGTACCT GAOQACATCA GCGTCCGGGT TATGTCATCT 120 

CAGTCTGTGC TTGTGTCCTG GGTG6ATCCT GTTCTGGAAA AACAGAAGAA AGTTCTTGCA 180 

TCAAGACAGT ACACOGTGCG CTATCGAGAG AAGGGGGAAT TGGCCAGGTG GGATTATAAG 240 

CAGATCGCTA ACAGGCGTGT GCTGATTGAG AACCTGATTC CAGACACTGT GTATGAATTT 300 

GCAGTCCGTA TTTCACAGGG TGAAAGAGAT GGCAAATGGA GTACGTCAGT CTTCCAAAGA 360 

ACACCAGRAT CTGCCCCTAC CACAGCTCCT GAAAACTTGA AGGTCTGGCC AGTCAATGGC 420 

AAACCTACAG TTGTOGCTGC ATCTT6GGAT GOGCTACCAG AGACTGAGGG GAAAGTGAAA 480 

GTCTGTCTGC TGGACACAGG ACTGTTTTCA GTTTCCTCCT TOCAACCATC TGCCAAATCA 540 

TTTCAGAATA CATTCTTTCA TACGCCCCGG CTCTCAAACC ATTTGGAGCA AAGTCCCTCA 600 

CCTATCCTGG AGACACTACT TCTGCCCTGG TGGATGGTCT GCAGCCTGGG GAACGCTATC 660 

TTTTCAAAAT COGGGCCACA AACAGGAGAG GCCTGGGACC TCACTCCAAA GCCTTCATTG 720 

TCGCTATGCC AACAAGAATG CAGCTGTACC CAGAAGGATT TCAGTTGTCT AGCTTACCTG 780 

ATCQATATCC AAAOCAAAGA AGTTAATAAA GATCCACAAC TGGAAGGGAG TGTrTTTGGA 840 

CCATGTTTTC TTTTCTACTT CCTCACATTT ATGCTG6ATA TTGGaSGCTT TTCCTTCATT 900 

ATGTGCTATG AAGACCCANN TGTTTCTTCT TTGACAGGCA ATTCTTTAAA ATCTGTTGCA 960 

GCCAGTAAGG CGGATX3TTCA GCAGAACAOG GAGGACAATG GGAAACCCGA AAAACCTGAG 1020 

CCTTCCTCAC CTTCTCCCAG AGCTCCAGCT TCCTCCCAAC ACCCCTCTGT GCCTGCTTCT 1080 

CCCCAAGGGA GAAATGCCAA GGACCTTCTT CTTGACTTGA AGAACAAAAT ATTGGCTAAT 1140 

GGTGGGGCGC CCOGAAAACC CCACCTTCGC GCCAAGAAGG CAGAGGAGCT GGATCTTCAG 1200 

TCGACAGAAA TCACTGGGGA GGAGGAGCTG GGTTCCOGGG AGGACTOGCC CATGTCACCC 1260 

TCAGACACCC AAGACCAGAA ACGGACCCTG AGGCCGCCAA GTAGACAOGG CCACTCX3GTG 1320 

GTTGCTCXXX3 GCAGGACTGC AGTGAGGGCC CGGATGCCAG OGCTGCCCCG AAGGGAAGGC 1380 

GTAGATAAGC CTGGCTTTTC CCTGGCCACG CAGCCCCGCC CAGGGGOSCC CCCCTCGGCT 1440 

TCGGCCTCTC CTGCCCACCA CGCGTCCACC CAGGGCACCT CTCATOGTCC TTCCCTGCCT 1500 

GCCAGCTTGA ATGACAACGA CTTGGTGGAC TCAGACGAAG ATGAGOGCGC TGTGGGCTCC 1560 

CTCCACCCCA AGGGOGCCTT CGCCCAGCCC CGGCCAGCCC TCTCCCCCAG COGCCAGTCC 1620 

C06TCCAG0G TTCTCCX3CGA CAGAAGCTCT GT6CACCCCG GOGCAAAGCC AQCCTCGCCG 1680 

GOGCOGAGGA CCCCCCATTC" AGGGGCOGCA GAGGAAGATT CCAGTGCCTC AGCCCCACCC 1740 

TCAAGACTTT CTCCACCCCA TGGGGGATCA TCTCGGCTGC TGCCCACCCA GCCACACCTG 1800 

AGCTCTCCAC TTTCCAAGGG CGGGAAGGAT GGTGAGGAOG CCCCAGCCAC CAACTCCAAT 1860 

GCGCCATCAC GGTCCACCAT GTCCTCCTCC GTCTCTTCrC ATCTCTCGTC CAGGACGCAG 1920 

GTCICTGAGG GA6GGGAGGC TTCTGATGGT GAAAGCCAOG GTGACGGCGA TAGGGAAGAC 1980 

6G0GGAAG6C AGGCGGAGGC CA06GCCCA6 AGGCIGGGGG CCOQGCCTGC CTCTGGACAC 2040 

TTCCATTTGC TCAGACACAA ACCCTTTGCT GOCAAOGGGA GGTCTCCAAG CAGGT TCAG C 2100 

ATTGGGOGGG GACCTOGGCT GCAGCCCTCC AGCTCCCCRC A6T0GACTGT GCCCTCOCGA 2160 
GCCCRCCOCa GGCrrccCTC TCACTCTGAT TCCCAOCCTA AGCTOVGCTC AGGTATCCAT 2220

GGAGAGGAOG AGGATGftGAA GCOGCTTCCT GCCA.GC5TT6 TCAATGACCA OGTGCCTTCC 2280 

TCCrCCAGGC AGOXaTCTC COGGGGCTCG GflGGACTTAA GGAGAAGCCC GCAQACAGGG 2340 

GCCAGCCTGC ATCGGAAGGA ACCCATCCCA GAGAACCCCA AATCCACAGG GGCAGATACA 2400 

CATCCICAGG GCAAGTACTC CTCCCTGGCX: TCCAAGGCTC AGGATGTTCA AC3VGAGC3VCA 2460 

GAOBOGGftCA CGGAIC5GGTCA TTCTCCCAAA GCACAGCCAG GGTCCACAGA CCGCCACGOS 2520 

TCCCCTGCTC GTCCTCCCGC AfiCACGGTCA CAGCSUGCATC CCAGTGTTCC CAGAAGGATG 2580 

ACACCOGGCC GGGCCCCAGA ACAGCAGOX CCTCCTCCOG TOGCCACGTC CCAGCACCAC 2640 

CCGGGACCCC AGAGCRGAGA OGCX5GGTOGG TCACCTTCCC AGOCCAGGCT CTCACTGACC 2700 

CAGGCCGGGC GGCCCOGCCC CAGGTCGCAG GGCOGCTOCC ACICCTOCTC GGACXXTTPAC 2760 

AOGGCGAGCT CCAGAGGGAT GCTCCCCACG GCCCTCCAGA AOCAGGACGA GGAIGCCCAG 2820 

GGCAGCTAOG ACGACGACAG CACAGAAGTC GAGGOXAGG ATGTCOGGGC CCCOSCGCAC 2880 

GCCGCGOGO: CCAAGGAGGC AGCTGCGTCC CTTCOCAAGC ACCAGC3«3GT GGACSTCTCCC 2940 

ACAGGGGCAG GGGCAGGTGG 06ACCACAG6 TCCCACGGGG GACATGOGGC CTCCCCOGCC 3000 

AGGCCCAGCC GACCOGGOGG CCCCCAGTCC OGOSCCCGCG TCCCCAGCAG GGCAGOGCGG 3060 

GGGAAGTOGG AGCCTCCTTC CAAGCGGCCC CTGTCCTCCA AGTCCCAGCA GTCG GTCTCA 3120 

GCCGAGGACE AGGAGGAGGA GGACGOSGGG TTrfTTAAAG GCGGGAAAGA AGACCTTCPG 3180 

TCTTCCTCIG TGCCAAAGTG GCCCTCTTCC TCCACTCCCA GGGGCGGCAA AGACGCCGAT 3240 

GGGA0GCTO3 CCAAGGAAGA GAGGGAGCCT GCCATCG<XC TTGOXCTOG CGGAGGGAGC 3300 

CTGGCTCCro TGAAGCGACC TCTCCCCCCA CCTCCAGGCA GCTCCCCCAG GGCCTCCCAC 3360 

GTCCCTTCCC GACOGCCGCC TCGCAGOGCT GCCACCGTGA GCCGCGTC6C GGGCACCCAC 3420 

CCCTCGCCGC GGTACACCAC GCGOGCCCCV CCXGGCCACT TCTOCACCAC CCOGATGCTC 3480

TCCTTGOGCC AGAGGATGAT GCATGCCAGA TTCOGTAACC CTCTCTCCOG ACACCCTGCC 3540 

AGACCCTCTT ACAGACAAGG TTATAATGGC AGACCAAATG TAGAAGGGAA AGTCCTTCCT 3600 

GGTAGTAATG GAAAACCGAA TGGACAGAGA ATTATCAATG GCCCTCAAGG AACAAAGTGG 3660 

G T T G T GG ACC TTGATCGTGG GTTAGTATTG AATGCAGAAG GAAGGTACCT CCAAQATTCA 3720 

CATGGAAATC CTCTTCGGAT TAAACTAG6A GGAGATGGTC GAACCATTCT AGATCTGGAA 3780 

GGGACCCCOG TCGTCAGTCC TGACGGCCTC' CCACTCTTTG GGOWSGGGCG ACATGGCACA 3840 

CXnCTGGCCA ATCCCCAAGA TAAGCCAATT rTGAGTCTTG GAGGAAAGCC GCTGGTGGGC 3900 

TTGGAGGTCA TCAAAAAAAC CAOCCATCCC CCTACCACTA CCATGCAGCC CACCACTACT 3960 

ACGACGCCCC TGOCTACCAC TAC3UVCCCCG AGGCCCACCA CTGCCACCAC CATGCAGCCC 4020 

ACCACTACTA CXSACGCCCCT GCCTACCACT ACACCGAGGC CCACCACTGC CACCACCCGC 4080 

GGCACXSACCA 0CAGG05TCC AACAACCACA GTCOGAACCA CTACX30GGAC AACCACCACX: 4140 

ACCACCCCCA AACCCACCAC TCCCATCCCC ACCTGTCCCC CTGGGACCTT GGAAOGGCAC 4200 

GAGGATCATG GCAACCTGAT AATGAGCTCC AATG6GATCC CAGAGTGCTA OGCTGAAGAA 4260 

GATGAGTTCT CAGGCTTGGA GACTGACACT GCAGTACCTA CGGAAGAGGC CTACGTTATA 4320 

TATGATGAAG ATTATGAATT TGAGACGTCA AGGCCACCAA CCACCACTGA GCCTTCGACC 4380 

ACTGCTACCA CACOyiGGGT GATCCCAGAG GAAGGCGCCA TCAGTTCCTT TCCTGAAGAA 4440 

GAATTTGATC TGGCTGGAAG GAAACGATTT GTTGCTCCTT ACGTCAOGTA CCTAAATAAA 4500 

GAOCCATCAG CCCCST G CTC TCTGACTGAT GCACTGGATC ACTTCCAAGT GGACAGGCTG 4560 

GATGAAATCA TCCCCAATGA CCTGAAGAAG AGTGATCTGC CTCCCCAGCA TGCTCCCCGC 4620 

AACATCACCX3 TGGTGGCCGT GGAAGGTTGC CACTCATTTG TCATTGTGGA TTGGGACAAA 4680 

GCCACXTCCAG GAGATTTGGT CACAGGTTAT TTGGTTTACA GTGCATCCTA TGAAGATTTC 4740 

ATCAGGAACA AGTTTTCCAC TCAAGCTTCA TCAGTAACTC ACTTGCCCAT TGAGAACCTA 4800 

AA6CCCAACA OGAGGTATTA TTTTAAAGTG CAAGCACAAA ATCCTCATGG CTAOGGACCT 4860 

ATCAGCCCTT OGGTCTCATT TGTCACOSAA TCAGATAATC CTCTGC3TCT TGTGAGGCCC 4920 

CCAGGCGGTG AGCTATCTGG ATCCCATTCG CTTTCAAACA TGATCCCAGC TACAOSGACT 4980 

GCCATGGAOG GCAATATGTG AAGCGCACGT GGTATOGAAA GTTOGTGGGA GTTGTTCTTT 5040 

GTAATTCACT GAGGTATAAA ATCTACCTCA GTGACAAOCT GAAAGATACA TTCTACAGCA 5100 

TTGGAGACAG CTGGGGAAGA GGTGAAGACC ATTGCCAATT TGTGGATTCA CACCTTGATG 5160 

GAAGAACAGG GCCTCAGTCC TATGTAGAAG CCCTCCCTAC TATTCAAGGC TACTATCGCC 5220 

AGTATCX3TCA GGAGCCTGTC AGGTTTGGGA ACATOGGCTT OGGAAOCCCC TACTACTATG 5280 

TGG6CTGGTA OGAGTGTGGO GTCTCCATCC CTGGAAAGTG GTAATGACAG GACCGTCATG 5340 

CTGCAAGCTT GCCCTGCCCA GCCCCACCAA CTAAGTOGCA CTAGGGGCTG TGAGCAAAGA 5400 

CAGCCAGCAT GCTCAGCCCC GCTGCCCTAG GTGCCAGGAA GGTCACAGAT GGACACTGGC 5460 

CATTCTGGTC ATCTCAGTCT GGAACTCAGT CCCACTTCTT GGCCTGGACA ATGAACAGGA 5520 

TTCAGTTTTG CTGTTAACTT TGCTTCTCTA CrrrmTm TTTGTTTGTA ATAGCACATC 5580 

GCAGAGACAT CAGAAACCAG CAACTGATTC AGTGTGATTT CCCAGACTTT TTAGGCATGA 5640 

AATTCGGACA CTPCAGTATT TCCAGGAATA GCATATGCAC GCTGTTCTTG CTTCATGGAA 5700 

TGCTACATGC TTTCTGTTTT TCTCATTTTG GATTTCTCCA AAACTAACTC AATTTAAGCT 5760

TCAGGTCCCT TTGTATGCAG TAGAAAGGAA TTATTAAAAA GAGCAGCAAA GAAAATAAAT 5820 

ATATCCTACT TGAAATTTAC TCTATGGACT TACOCACTGC TAGAATAAAT GTATCAAATC 5880 

TTATTTGTAA ATTCTCAATT TTGATATATA TATGTATATA TGCATATACA TATCCACACT 5940 

TGTCTGCAAG AATATTGATT AAAATTGCTA AATTTGTACT TGTTCAGCAA AAAAAAAAAA 6000 
AAAAAAA 

Seq ID NO: 419 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

1 I I I I I 

HPGTKLTRTG APADYRVIUC TSQEDBLDVP DDISVRVMSS QSVLVSWVDP VLEKQKKWA 60 

SRQYTVRYRE KGELARWDYK QIAHRRVLIB NLIPDTVYEP AVRISQGEBD GKWSTSVFQR 120 

TPBSAPTTAP EaiLNVWPVNG KPTWAASHO ALPBTESRVK VCLLDTGIjFS VSSFQPSAKS 180 

FQNTPFHTPR LSNHLEQSPS PILETLLLPW WMVCSLGNAI PSKSGPQTGE AWDLTPKPSL 240 

SLCQQECSCT QKDFSCLAYL IDIQTKQVNK DPQLEGSVFG PCFLPYPLTF MLDIGGPSFI 300 

MCYEDPVSSI/ TGNSLKSVAA SKADVQONTB DNGKPEKPEP SSPSPRAPAS SQHPSVPASP 360 

QGRNAKDLLIj DLKNKILAKG GAPRKPQLRA KKAEELDLQS TEIT6EEELG SREDSPMSPS 420 

DTQDQKRTLR PPSRBGESW APGRTAVRAR MPALPRRBGV DXPGPSXATQ PRPGAPPSAS 480 

ASPAHHASTQ GTSHRPSLPA SLNDMDLVDS DEDERAVGSL HFKGAFAQPR PALSPSRQSP 540 

SSVLRDRSSV HPGAKPASPA RRTPHSGAAE EDSSASAPPS RLSPPHGGSS RLLPTQPHLS 600 

SPLSKGGKDG EDAPATNSNA PSRSTMSSSV SSHLSSRTQV SEGAEASDGE SHGDGDREDG 660 

GRQAEATAQT LRARPASGHF HLLRHKPFAA NGRSPSRFSI GRGPRLQPSS SPQSTVPSRA 720 

HPRVPSHSDS HPKLSSGIHG DEEDEKPLPA TWNDHVPSS SRQPISRGWE DLRRSPQRGA 780 

SLHRKEPIPE NPKSTGADTH PQGKYSSLAS KAQDVQQSTD ADTEGHSPKA QPGSTDRHAS 840 

PARPPAARSQ QHPSVPRRMT PGRAPBQQPP PPVATSQKHP GPQSRDAGRS PSQPRLSLTQ 900 

AGRPRPTSQG RSHSSSDPYT ASSRGMIiPTA LQNQDS3AQG SYnDDSTEVE AQOVRAPAHA 960 
ARAKEAAASL PKHQQVBSPT GAGAGGDHRS QRQIAASPAR PSRPGGPQSR ARVPSRAAPG 1020

KSEPPSICRPL SSKSQOSVSA EDEEEEDAGF FKGGKSILLS SSOTKWPSSS TPRGGKDADG 1080

SIAKSKREPA lALAPRGGSL APVKRPLPPP PGSSPRASHV PSRPPPRSAA TVSPVAGTHP 1140 

HPRYTTBAPP GEFSTTPMI*S LRQRMMHARF HNPLSHQPAR PSYRQGYNGR PNVEGKVLPG 1200 

SKGKPllGQRI INGPQGTKWV VDIiDRGLVUf AEX»YLQDSH GNFXiRIKLGG OGRTI VOLEG 1260 

TFWSPDGLP LPGQGRHGTP LANAQOKPIL SLGGKPLVGL EVZKKTTKPP TTTMQ?TTTT 1320 

TPLPTTTTPa PTTATTrtQPT TTTTPLPTTT PRPTTATTRR TTTRRPTTTV RTTTRTTTTT 1380 

TPKPTTPIPT CPPGTLBRHD DDGNLIMSSN GIPECYAEED EFSGLETTJTA VPTEEAYVIY 1440 

DSDYEFETSa PPTTTEPSTT ATTPRVIPES GAISSPPEEE FDLAGRKRFV APyVTYUJIO) 1500

PSAPCSLTDA LDHFQVDSLD EIIPNDLKKS DLPPQHAPaN ITWAVEGCH SFVIVDWDXA 1560

TPGDLVTGYL VYSASYS3FI RNKFSTQASS VTHU>IEIILK PNTRYYFKVQ AQNPHGYGPI 1620 
SPSVSFVTES DNPLLWRPP GGELSGSBSL SMMIPATRTA MDGNM 



Seq ID NO: 420 DNA sequence

Nucleic Acid Accession # : NM_022743 

Coding sequence: 128.. 1237 

1 11 21 31 41 51 

I I i 1 i I 

GTGGATTTTA GAGATACCTC CCCTCCTTCT GCTCAGCTGC CTTGCAGTAA TTAAACTCTT 60 

TCTCTGCTCC AACACCCCTA CTGTTCTCCG TGTATTGGCT TTTCTGGGCA GCAGGAAGGA 120 

AAAGCTGATC CGATGCTCTC AGTGCCX3CGT CGCCAAATAC TGTAGTGCTA AGTGTCAGAA 180 

AAAAGCTTGG CCAGACCACA AGOGGGAATG CAAATGCCTT AAAAGCTGCA AACCCAGATA 240 

TCCTCCftGAC TCOGTTOGAC TTCrTGGCAC AGTTGTCTTC AAACTTATGG ATGGAGCACC 300 

TTCAGAATCA GAGAAGCTTT ACTCATTTTA TGATCTGGAG TCAAATATTA ACAAACTGAC 360 

TQAAGATAAG AAAGAGGGCC TCAGGCAACT CGTAATGACA TTTCAACATT T CATG AGAGA 420 

AGAAATACAG GATGCCTCTC AGCTGCCACC TGCCTTTGAC CTTTTTGAAG CCTTTGCAAA 480

AGTGATCTGC AACTCTTTCA CXaVTCTGTAA' TGCGGAGATG CAGGAAGTTG GTGTTGGCCT 540 

ATATCCCAGT ATCTCTTTGC TCAATCACAG CTGTGACXXX: AACTGTTCGA TTGTGTTCAA 600 

XGGGCCCCAC CTCTTACTGC GAGCAGTCCG AGACATGGAG GTGGGAGAGG AGCTCACCAT 660 

CTGCTACCTG GATATGCTGA TGACCAGTGA G6AGCGC0GG AAGCAGCTGA GGGACCAGTA 720 

CTGCTTTGAA TGTGACTGTT TOOGTTGCCA AACCCAGGAC AACGATGCTG ATATGCTAAC 780 

TGGTGATGAG CAAGTATGGA AGGAAGTTCA AGAATCCCTG AAAAAAATTG AAGAACTGAA 840 

GGCACACTGG AAGTCGGAGC AGGTTCTGGC CATGTGCCAG GCGATCATAA GCAGCAATTC 900 

TGAACGGCTT CCCGATATCA ACATCTACCA GCTGAAGGTG CTCGACTGCX3 CCATGGATGC 960 

CTGCATCAAC CTCGGCCTGT TGGAGGAAGC CTTGTTCTAT GGTACTCXjGA CCATGGAGCC 1020 

ATACAGGATT TTTTTCCCAG GAAGCCATOC CGTCAGAGGG GTTCAAGTGA TGAAAGTTGG 1080 

CAAACTGCAG CTACATCAAG GCATGTTTCC CCAAGCAATG AAGAATCTGA GACTGGCTTT 1140 

TCATATTATG AGAGTGACAC ATGGCAGAGA ACACAGOCTG ATTGAAGATT TGATTCTACT 1200 

TTTAGAAGAA TGCGACGCCA ACATCAGAGC ATCCTAAGGG AACGCAGTCA GAGGGAAATA 1260 

CX3GCGTGTGT CTTTGTTGAA TGCCTTATTG AGGTCACACA CTCTATGCTT TGTTAGCTGT 1320 

GTGAACCTCT CTTATTGGAA ATTCTGTTCC GTGTTTGTGT AGGTAAATAA AGGCAGACAT 1380 

GGTTTGCAAA CCACAAGAAT CATTAGTTGT AGAGAAGCAC 6ATTATAATA AATTCAAAAC 1440 
ATrTGGTTGA GGATGCCAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 421 Protein sequence 
Protein Accession #: NP_073560 

1 11 21 31 41 51

I I i I ] I 

HRCSQCRVAK YCSAXCQKKA WPDHKRECKC LKSCKPRYPP DSVRLIiGRW FKLMDGAPSE 60 
SEKLYSPYDL ESNINKLTED KKEGLRQLV« TFQHFMREEI QDASQLPPAF DLPEAFAKVI 120 
OISFTICNAE MQBVGVGLYP SISLLNHSCD PKCSIVFMGP HLLLRAVRDI EVGEEXiTICY 180 
LDMLMTSEER RKQIiROQYCF ECDCFRGQTQ DKDADMLTGD EQVWKEVQBS LKKIEBLKAH 240 
WKWEQVLAMC QAIISSNSER IiPDINIYQUC VU3CAMDACI NZ#GLLEEAtf YGTRTMEPYR 300 
IFFPGSHPVR GVQVMKVOKL QLHQGMPPQA MKNLRLAPDI MRVTKGREHS LIEDLUiLLE 360 
ECDANIRAS 



Seq ID NO: 422 DNA sequence

Nucleic Acid Accession #: NM_003014.2

coding sequence: 238.. 648 



1 11 21 

I I I 

GGCGGGTTOG OGCOCOGAAG GCTGAGAGCT 
CGGAGCTCCXJ OGGCCGGACC CCGCGGCCCC 
AAACTCTCCT GCGCCCCAGA AGATTTCTTC 
GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC 
TTCCTCTCCA TCCTAGTGGC GCTGTGCCTG 
GCGCCCTGOG AGGCGGTGCG CATCCCTATG 
ATGCCCAACC ACCTQGACCA CAGCAGGCAG 
GAGGAGCTGG TGGACGTGAA CTGCAGCGCC 
GCGCCCATTT GCACCCTGGA GTTCCTGCAC 
CAACGOGCGC GCGACGACTG CGAGCCCCTC 
AGCCTGGCCT GCGACX5AGCT GCCTGTCTAT 
ATGGTCAGGG ACCTCCCGGA GGATGTTAAG 
CAGGAAAGGC CTCTTGATGT TGACTGTAAA 
AAGGTGAAGC CAACTTTGGC AACGTATCTC 
AAAATAAAAG CTGTGCAGAG GAGTGGCTGC 
GAGATCTTCA AGTCCTCATC ACCCATCCCT 
TCTTGCCAGT GTCCACACAT CCTGCCCCAT 
CGTTCAAGGA TGATGCTTCT TGAAAATTGC 
AAAAGATCCA TACAGTGGGA AGAGAGGCTG 
AAGAAAACAG CCGGGCGCAC CAGTCGTAGT 
GCTGCCAAAC CAGCCAGTCC CAAGAAGAAC 
AACCCGAAAA GAGTGTGAGC TAACTAGTTT
GATGAGGCTG GGCATTGCCT GGGAGAGCCT



31 41 51 

i i ) 

GGCGCTGCTC GTGCCCTCTG TGCCAGAOGG 60 

GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120 

CTCGGCGAAG GGAGAGCGAA AGATGAGGGT 180 

GGGGTOGCAG CGCGAGAGGG CAGTGCCATG 240 

TGGCTGCACC TGGOGCTGGG CGTGCGCGGC 300 

TGCCGGCACA TGCCCTGGAA CATCACGCGG 360 

GAGAAGGCCA TCCTGGCCAT OGAGCAGTAC 420 

GTCCTGCGCT TCTTCTTCTG TGCCATGTAC 480 

GACCCTATCA AGCCGTGCAA GTCGGTGTGC 540 

ATGAAGATGT ACAACCACAG CTGGCCCGAA 600 

GACCGTGGCG TGTGCATTTC GCCTGAAGCC 660 

TGGATAGACA TCACACCAGA CATGATGGTA 720 

CGCCTAAGCC CCGATCGGTG CAAGTGTAAA 780 

AGCAAAAACr ACAGCTATGT TATTCATGOC 840 

AATGAGGTCA CAACGGTGGT GGATGTAAAA 900 

CGAACTCAAG TCCOGCTCAT TACAAATTCT 960 

CAAGATGTTC TCATCATGTG TTAOy^STGG 1020 

TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 

CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 

AATCCCCCCA AACCAAAGGG AAA6CCTCCT 1200 

ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 

CCAAAGC6GA GACTTCCGAC TTCCTTACAG 1320 

ATGTAAGGCC ATGTGOCCCT TGCCCTAACA 1380 
AcrcACTCCA crrGcrcrrcA tagacacatc ttgcagcrtt

t.rnii ' ClH GTAAGCCATC ACAAGCCATA GTGGTAGGTT 
GASTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA 
CTW3\AGAGT AGGGAAAATA ATGCTTGTTA CAATTCGACC 
AAATGCCATA TTTCAAACAA AAC ACGTAAT TTTTmCftG 
TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAATGTGAT 
AAGGAACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT 
TmrciGAT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG 

ivivrm 'rr taccaatgac ttcagtttct gti'ii iagct 

AATAATAAAG AAAAATAAAT AAAAAGGACA GGCAGA CAAT 
GTTACCPGAT TTCCATGATC ATGATGCTTC TTGTCAACAC 
ACAGTGAGTT TGTCTGTACC ATTAGGAGTT AGGTACTAAT 
ATTTTATACC CACAAGAGAG GTATGTCACT CATCTTACTT 
AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTGAGTGC 
TTAAATATTT TCTTTGCCTA AATACATGTG AG AGGA GTTA 
AAAGTTGAGT TCCACCTCTG AAATGAGAAT TACTTGACAG 
AAAAAGAACT TATTTGCAGC ATTTTATCAA CAAAT TTCAT 
ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA 
AGGCATTCAA TAAATGCACA ACGCCCAAAG GAAATAAAAT 
ACTACACAGA GGTAATCACT ATTAGTATTT TXX3CATATTA 
GCACTTATAA AATGATTTGA ACAAATAAAA CTA GGAAC CT 
CTULXrrUCTT TGCTTGGCCC TTTATTGAGA TAAGTTTrCC 
TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT 
TATTGOATAC TTAGGTGGTT TCTTCACTGA GAATACTGAA 

Seq ID NO: 423 Protein sequence
Protein Accession #: NP_003005.1



PCT/US02/12476



TTTCTTAAGG 
TGCCCTTTGG 
GAGTAACCTG 
TAATATGTGC 
TAIGTTTTAT 
GAAAATATAA 
GTTTATGGTC 
TAGCATATGG 
AGAAACTTAA 
GTCTGGATTC 
CCTCTTAAGC 
TAGTTGGCTA 
CCCAGGACAT 
CAAATTTTGT 
AATATAAATG 
TTGGGATACT 
AATTGTGGAC 
CAGTAAGCAT 
CCTATCTAAT 
TTCTCCAGGT 
GTATACATGT 
TGTCAAGAAA 
TACTCAACAA 
TAAACATCTC 



CTATGCTTCA 
TACACAAGGT 
TGTGCATACT 
ArrCTAAAAT 
TACCTTTTGA 
TGTTTTTAAG 
TGCAGAAGGA 
AAAATTATAA 
AAACAAAAAT 

AGCACCAGAA 
ATGCTCAAGT 
CCACCCTGAG 
TTTTCTTCAT 
TACAGAGAGG 
TTAATCAGAA 
AATTGGAGGC 
GTATTTTATA 
CCTACTCTCC 
GTTTGCTTAT 
GTTTCATAAC 
GCAGAAACCA 
ACTGTTGTGC 
ACOGGAATTC 



30 I 



MPLSILVALC 
YEELVDVNCS 
ESLACDELPV 
KKVKPTLATY 
SSCQCPaZLP 
KKKTAGSTSR 



11 

I 

LWLHLALGVR 
AVLRFPFCAM 
YDRGVCISPE 
IjSKKYSYVIH 
HQDVLIMCYB 
SNPPKPKGKP 



21 31 

I I 

GAPCEAVRIP KC3iHMPMNIT 
YAPICTLBFL HDPIKPCKSV 
AIVTDLPEa5V KWIDITPOMM 
AKIKAVQRSG CNEVTTWDV 
VRSnmid^ ChVSSOiSDQh 
PAPKPASPKK NIKTRSAQKR 



Seq ID NO: 424 DNA sequence 
Nucleic Acid Accession #: BC010423
Coding sequence: 248.. 1780 



CACAGCGTGG 
AGCTACGGCT 
CAAGTGCGAG 
TCTGCAGCCG 
TTCAACCATG 
GCTGCTACTG 
CGTGGTAACT 
CGGCGAGCAA 
ACTAGCGCTA 
GGAGCAGCCG 
GCAGGCGGAT 
GGCGCGGCTG 
ACTAGAAGAG 
CCCCAGCGTG 
CTCCOGCTCT 
GCAGCCACTG 
CATCCTCCAC 
GTGGCACATT 
CrCATACAAC 
CACTTTGGGC 
CAATGAGTTC 
CTCTGGGAAG 
ACTCTTGTTC 
GGCCCAGCAG 
COGGAGGCTG 
GAGAGCCGAG 
AGAGCCCGA6 
TGAACTGCTG 
CAAACAGGCC 
CAATGGCATC 
CTAGGCCTGG 
ACACCCCCAT 
AACCCTTCTG 
CACTGTGTGT 
TGACTGTCCG 
AAGTGAACTG 
GTTTGGCGTG 
CAGACOCCAG 
CAGACCCAGG 
TCTCCTACCA 
GAGGCTTGAA 
ACATATTTTC 
ACTTTTAATT 
TTTTATTTTT 



11 
I 

GAAGCAGCTC 
GGGTGTGTAG 
AGGCAAGAAC 
GCTCCCAGGG 
CCCCTGTCCC 
CTGGCATCAT 
GTGGTGCTGG 
GTGGGGCAAG 
CTGCACTCCA 
CX3GCCCCCAC 
GAGGGCGAGT 
GGGCTOCGAO 
GGCCAGGGCC 
ACCTGGGACA 
GCTGCCGTCA 
ACrrGTGTGG 
GTGTCCTTCC 
GGCAGAGAAG 
TGGACACGGC 
TTTCCCCCAC 
TCCTCAAGGG 
CAGGTGGACC 
TGCCTTCTGG 
ATGACCCAGA 
CATTCCCATC 
GGCCACCCTG 
GGCGGCAGTT 
TCTCCAGGCT 
ATGAACCATT 
TACATCAATG 
CTCCTTCTGT 
TTCTTGCGGA 
TTCATCGGGA 
GTGCATGTGT 
TGGAGGGGTG 
TGGTGTATGT 
TGTGTCATGT 
AGCAGTATTA 
TGTGCGGGCA 
CTTCGGAGCC 
CTGTTACAGA 
TGTAAATATA 
TTTTTCTTTT 
ATTTTTTTTT 



21 
I 

TGGGG6AGCT 
AACGGGGCCG 
TCTGCAGCTT 
AGATCTOGGT 
TGGGAGCCGA 
TTACAGGCCG 
6CCAGGAOGC 
TGGCATGGGC 
AATACGGGCT 
GCAACCCCCT 
AGGAGTGCGG 
TGCTGGTGCC 
TGACCCTGGC 
CX3GAGGTCAA 
CCTCAGAGTT 
TGTCCCATCC 
TTGCTGAGGC 
GAGCTATGCT 
TGGATGGGCC 
TGACCACTGA 
ATTCTCAGGT 
TAGTGTCAGC 
TGGTGGTGGT 
AATATGAGQA 
ACAOGGACQC 
ATAGTCTCAA 
ACTCCACGCT 
CTGGGCGGGC 
TTGTTCAGGA 
GGOGGGGACA 
TGACATGGGA 
AGATGCTCCC 
GG6CTCCA0C 
GCCTGTGTGA 
ACTGTGTCCX; 
GCCACGGGAT 
GGCTGTGTGT 
ATGATGCAGA 
TAGCTGGAGC 
ATGGGGGCAA 
AGCCCTCTGC 
CATGCGCOGG 
TTTTTTCTTG 
AGAGTTTGAG 



31 
I 

CGGAGCTCCC 
GGGCTGGGGC 
CCTGCCTTCT 
GGAACTTCAG 
GATGTGGGGG 
GTGCCCCGCG 
AAAACTGCCC 
TCGGGTGGAC 
TCATGTGAGC 
GGACGGCTCA 
GGTCAGCACC 
TCCCXTTGCCX 
AGCCTCCTGC 
AGGCACAACG 
CCACTTGGTG 
TGGCCTGCTC 
CTCtGTGAGG 
CAAGTGCCTG 
TCTGCCCAGT 
GCACAGCGGC 
CACTGTCGAT 
CTOGGTGGTG 
GGTGCTCATG 
GGAGCTGACC 
GAGGAGCCAG 
OGACAACAGT 
GACCAOGGTG 
06AGGAGGAG 
GAATGGGACC 
CCTGGTCTGA 
GATTTTAGCT 
CATCCCACTG 
AATT6AGTCT 
GTGTTGACTG 
TGGTGTGTAT 
TTGAGTGGTT 
GACCTCTGCC 
GGTTGGAGGA 
TGGAATCTGC 
GTGTGAAGCA 
GCTCTGGrOG 
GAGCTTCTTG 
CCCTTTCCAT 
TCCAGCCIGG 



41 
I 

GATCACGGCT 
TGGGTCCCXT 
GGGTCAGTTC 
AAACGCTGGG 
CCTGAGGCCT 
GGTGAGCTGG 
TGCTTCTACC 
GCGGGCGAAG 
CCGGCTTAOS 
GTGCTCCTGC 
TTCXXXGCOG 
TCACTGAATC 
ACAGCTGAGG 
TCCAGCCGTT 
CCTAQCCGCA 
CAGGACCAAA 
GGCXTTTGAAG 
AGTGAAGGGC 
GGGGTACGAG 
ATCTAOGTCT 
GTTCTTGACC 
GTGGTGGGTG 
TCCOGATACC 
CTGAOCAGGG 
CCGGAGGAGA 
AGCTGCTCTC 
AGGGAGATAG 
GAAGATCAGG 
CTAOGGGCCA 
CCCAGGCCTG 
CATCTTGGGG 
ACTGCTTGAC 
CTCCCACCAT 
ACTGTGTGTG 
TATGCTGTCA 
GOGTGGGCAA 
TGAAAAAGCA 
GAGAGGTGGA 
CTCCGGTGTG 
GCCAGTCCCT 
CCTCTGGGCC 
CAGGAATACT 
TAGTTGTATT 
A06ATATAGC 



1440 
1500
1560
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 



41 51 

I I 

RKPMHLHHST QENAILAIBQ 
CQRARODCEP LMKKYNHSWP 
VQERPU3VDC KRXjSTORCKC 
KEIFKSSSPX PRTQVPLITN 
SKRSIQHEER 2X2GQRRTVQD 
TNPKRV 



51 
I 

TCTTGGGGGT 
AGTGGAGACC 
CTTATTCAAG 
CAGTCTGCCT 
GGCTGCTGCT 
AGACCTCAGA 
GAGGGGACTC 
GCGCCCAGGA 
AGGGCCGCGT 
GCAACGCAGT 
GCAGCTTCCA 
CTGGTCCAGC 
GCAGCCCAGC 
CCTTCAAGCA 
GCATGAATG6 
GGATCACCCA 
ACCAAAATCT 
AGCCCCCTCC 
TGGATGGGGA 
GCCAT6TCAG 
CCCAGGAAGA 
TGATCGCCGC 
ATCGGCGCAA 
AGAACTCCAT 
GTGTAGGGCT 
TGATGAGTGA 
AAACACAGAC 
ATGAAG6CAT 
AGCCCAOGGG 
CCTCCCTTCC 
CCCTCCTTAA 
CTTTACCTCC 
GCATGCAGGT 
1GTGGAGGGG 
TATCAGAGTC 
CACTGTCAGG 
GGTATTTTCT 
GACTGTGGCT 
AGGGAACCTG 
GGGTCAGCCA 
7GCFGCATGT 
GCTC OGAATC 
TTTTATTTAT 
CAGACCCTGT 



60 
120 
180 
240 
300 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280
2340 
2400 
2460 
2520 
2580 
2640 



I 




WO 02/086443

CTGTAAAAM AOCAAAAOOC AAAAAAAAM AAAAAAAAAA 









Seq ID NO: 425 Protein sequence
Protein Accession #: AAH10423



1 
I 

mhSUSKEKX 
QVGQVAKARV 
DSGBYECEVS 
VTWDTEVKGT 
HVSFLAKASV 
6FPPLTTEHS 
FCUiWWVL 



11 
I 

GPEAWLLUiL 
DAGEGAQBLA 
TFPAGSFQAS 
TSSRSPKKSR 
RGLEDQNIiHH 
GIYVCHVSNE 
MSRYKRRKAQ 



AKMBFVQEIIG TUtAXPTGNG 



21 
I 

LLASFTGRCP 
LLRSKYGLHV 
LRLRVLVPPIi 
SAAVTSEFHL 
IGREGAMIiKC 
FSSRDSQVTV 
QKTQKYEBEL 
EGRSYSTLTT 
lyiKGRGHLV 



31 
1 

AGELETSDW 
SPAYSGRVEQ 
PSLNPGPALS 



tiSBGQPPPSY 
DVU3PQEDSG 
TLTRENSIHR 
VREIETQTEL 



41 

1 

TWLGffllAKL 
PPPPRNPUX; 
SGQQLTlAfiS 
LTCWSHPGL 
NHTRU3GPLP 
KQVDLVSASV 
LHSHBTDPRS 
LSPGSGSAEB 



51
I 

PCFYRCTSGE 
SVLLRKAVQA 
CXAEGSPAPS 
LODQAITHZL 
SCVRVDCSTTL 
VWGVIAALL 
QPSESVGlliRA 
EEDQDBGIKQ 



Seq ID NO: 426 DNA sequence

Nucleic Acid Accession #: NM_003474.2

Coding sequence: 37..3036



1 

I 

CACTAA(XCT 
TCAAGGCTGG 
CTTTTTTAAA 
CCGGAGCT6A 
GCCCGCX3TGG 
GCGAOGATGG 
GCCGGTGCTC 
GCTGATGAAG 
TTCGACTCCA 
CT6ATCATAA 
TATCTGCAAG 
TGTTACTACC 
TCTGGTCTCA 
AGTGCAACCA 
TGTGGATCAC 
CAGACATGGG 
GTGATCGTGG 
CAGC GATTA A 
ATCGT6TTGG 
CCATTCACCA 
TCCCATGACA 
GCCXXAATCA 
GACAATCCCC 
AATCATGACA 
ATCATGAACG 
GACTTGGAGA 
AGGGAGTCTT 
GACTGTGGGG 
AAGCCGGACG 
GGAACAGCGT 
AGCCCTCACT 
GGCTACTGCT 
CCAGGTGCTA 
TATGGCAACT 
AAATGTGGAA 
GTTTCCATAG 
CAOOTGTACT 
GCAGATGGAA 
GAGTGTGCAA 
GAGGCCCACT 
GGCCCCATCC 
TGTCTTCTTG 
TTTAOVAATA 
CGTGGCTTCC 
COGCCAGATT 
GACATCAGCA 
CTTCCTCCCC 
AAGCCTGCAC 
CCTGCAGATC 
CAATGGGAGA 
GTGCCCAGAT 
TGAAGACAGA 
GGATTTTTTT 
GCTGTGCTGT 
GCAGAATGTT 
CCATGGCAGG 
ATGGGATTCT 
CCAACTAOCC 
CfCAGTTGAT 
TGTGTTTGGC 

acacx:tggga 
aggaatctta 
tgaooctgag 

GGTGCTTGAT 



11 



CTTCCTAGTC 

AATGAAAGGC 
CTCGCCGAGG 
GATGGTGCAG 
CAGGGCGCCC 
TGCTOGCGCC 
TTGTCAGTGC 
AGAATCATCX 
ATCTGGAAAG 
AC6GTACTGA 
ATGGACATGT 
GGGGACTTAT 
ACAGATACAA 
ATCACAACAC 
CAAGAAGGCA 
CAGACAACCG 
TACAGATTGC 
TAGGCGTGGA 
GCCTCCATGA 
ATGCGCAGCT 
TGAGCATGTG 
TTGGTGCAGC 
CACTGGACAG 
CTTCCACCGG 
CCAGCCTGGA 
TCGGGGGCCA 
AGCCAGAGGA 
CTGTGTGCGC 
GCAGGGACTC 
GCOCAGCCAA 
ACAATGGCAT 
AACCTGCCCC 
GTGGCAAAGT 
AAATCCAGTG 
AAACAAACAT 
TGGGOGATGA 
AAATCTGCCT 
TGCAGTGCCA 
GGGCACCTCC 
GGCAAGCAGA 
CTGCCXXSATT 
AGAAGACCAC 
AACCCTGTCA 
CCTACCCACC 
GACCCCTCAA 
TCCAOCGGGC 
TTAGGCAGGC 
CTCTGGCCAG 
CTGGGCTCCX5 
C CACCC ACAC 
AGTTTGCACT 
TAATGTTTAA 
GCTATGGTGC 
GATTACAGTG 
AAGGCTTGTT 
GGACAGGATG 
GCAGCTGTGC 
TTTCTGGATT 
TTTCAGGGAG 
GAAATCTGGC 
AGGTGTAGCC 
CTGACCA6CC 
AGAAAT6CCA 



21 

1 

CCCGGGCCAA 
AACGGGG06C 
TAGAAGAGCT 
CAGGAAATCC 
GGCTCGCCGC 
GCTGCCGGTO 
CTGCGAGGCC 
CTCTGTTOGG 
AGAAGTGCTG 
AAATGAAGGT 
tGTCTCCCTC 
AC6GGGATAT 
TGTGTTTGAA 
ACTCTTCCCA 
ACCAAACCTC 
TAAAAGAGAG 
AGAGTTTCAG 
TAATCAOGTT 
AGTGTGGAAT 
ATTTCTGGAC 
TGTCAGTGGG 
CACGGCAGAC 
CGTGACCCTG 
GGGCTGTAGC 
GTACCCATTT 
GAAAG GAATG 
GAAGTGItjGG 
ATGTATGAAT 
ACATGGGCTG 
CAGCAACTCC 
CGTGTACCTG 
CTGCCAGACT 
TGGGATCTGC 
CTCGAAGACT 
TCAAGGAGGT 
CCCCCTGCAG 
CATGCCGGAC 
GAATCGTCAA 
CGGCAGAGGG 
CTTCTGTGftC 
TAACCAAGGT 
TGTGGTTTAT 
CATTGAAAAA 
GGCTCACCTC 
GAAGGACAAT 
CGGCCTGAAT 
CCCACGTGCA 
CCAGGGGACC 
AACAACTCGG 
CCTGGCACCC 
CGCCTATATT 
ATCTTTCAGC 
AACATCATTA 
TCTGTCTACT 
CAGTGCGCTG 
GTGCrTTTAG 
TGTTTGCTTT 
TTATGGTACC 
COOCATCTCA 
GCCCTGTGCC 
TTCTGGCCAG 
ACACCAGGAT 
GTGAGCATGT 
AGCACTTCTT 



31 
I 

CTGGGACAGT 
GCX»:GA06CA 
CAGCGGCOGC 
CTCCGGTCGC 
CGGGCCOGAG 
TCCCCOGCCC 
CQAGGGGTGA 
AGTGGGGACC 
AATATTCGAC 
CTCATTGCCA 
GCTCGAAATT 
TCTGATTCAG 
AATGAAAGCT 
GGGAAGAAGC 
GCTGCAAAGA 
ACCCTCAAGG 
AGGCAAGGAA 
GACAAGTTTT 
GACATGGACA 
TGGAGGAAGA 
GTTTATTTCC 
CAGTCTGGGG 
GCACATGACC 
TGTCAAATGG 
CCCATGGTGT 
GGGGTGTGCC 
AACAGATTT6 
OGCTGCTGCA 
TGCTGTGAAG 
TGTGACCTCC 
CACGATGGGC 
CACGAGCAGC 
TTTGAGAGAC 
TCCTTTGCCA 
GCCAGCCGGC 
CAAGGAGGCC 
CCAGGGCTTG 
TGTCAAAATA 
GTGTGCAACA 
AAGTTTGGCT 
TTAACCATAG 
CTCAAAAGGA 
CTAAGGTGTG 
GGCCACCTTG 
CCCAGGAGAT 
GTCCCrCAGC 
CCTAGCGTCC 
TGTAAGCCAA 
CTCACTCATG 
CTCAGACCTG 
AAGTGAGAAG 
TCCAGTTGGA 
CTATAAGAAC 
TGCACA6GTA 
TAGTAGGCAT 
TATTTTAGTG 
CTGATCAAGG 
AGATGCftGCT 
GGCCAGAGCC 
CCTTGACAAC 
GAAGCTTTGG 
AGAGACTGGA 
TTGGAAGGGG 
TTTCTOGCTG 



41 
I 

TTGCTCATTT 
OGCACACACA 
GCGGGCCX3TG 
GACGCCCGGC 
AGCTGCTGCA 
GOSCCCTCCT 
GCTTATG6AA 
TCTGGATCCC 
TACAACGGGA 
GCAGTTTCAC 
ACACGGTAAT 
CAGTCAGTCT 
ATGTCTTAGA 
TGAAAAGGQT 
ATGTGTTTCC 
CAACTAAGTA 
AAGATCTGGA 
ACAGACCACT 
AATGCTCTGT 
TCAAGCTTCT 
AAGGGACCAC 
GAATTGTCAT 
TGGCCCACAA 
CGGTTGAGAA 
TCAGCAGTTG 
TGTTTAACCT 
TGGAAGAAGG 
ATGCCACCAC 
ACTGCCAGCT 
CAGAGTTCTG 
ACTCATGTCA 
AGTGTGTCAC 
TCAATTCTGC 
AATGtXSAGAT 
CAGTCATTGG 
GGATTCTGTG 
TGCTTGCAGG 
TTAGTGTCTT 
ACAGGAAQAA 
TTGGAGGAAG 
GAATTCTGGT 
AGACCTTGAT 
TGCGCCCTTC 
GAAAAGGCCT 
TGCTGCAGTG 
CCCAGTCAAC 
CTGCCAGACC 
ACCCCCCTCA 
CCTTGGCCAG 
CTCCACAATA 
COGACACCTT 
GTTTTTTGTA 
TTTGAGCTAC 
CTTGTAAATT 
TTTTACCATC 
AACTTGAAAT 
CCTTATTGGA 
CAAGAGATCC 
AAGGGGCTTC 
TGGCAGGCAG 
TGAGAACCTG 
ACACTA6ACA 
TCTGTAGTGT 
TCCTTTCTAG 



51 
I 

ATTGCAAGGG 
OGGGGGGAAA 
OGOGAGGGCT 
CCCGCTCGGC 
CTGAAQGCOG 
GCTOGCCCTG 
CGAAGGAAGA 
AGTGAAGAGC 
AA6CAAAGAA 
GGAAACCCaC 
TCTGGGTCAC
CAGCACGTGT 
ACCAATGAAA 
CCGGGGATCA 
ACCACCCTCT 
TGTGGAGCTG 
AAAAGTTAAG 
GAACATTCGG 
AAGTCAGGAC 
ACCTCGCAAA 
CATOGGCATG 
GGACCATTCA 
TTTC5GGGATG 
AGGAGGCTGC 
CAGCAGGAAG 
GCCX3GAAGTC 
AGAGGAGTGT 
CTGTACCCTG 
GAAGCCTGCA 
CACAGGGG(X: 
GGATGTGGAC 
ACTCTGGGGA 
AGGTGATCCT 
GAGAGATGCT 
TACCAATGCX 
CCGGGGGACC 
CACAAAGTGT 
TGGGGTTCAC 
CTGCCACTGC 
CACAGACAGC 
GACCATCCTG 
ACGACTGCTG 
CCGGCCACCC 
GATGAGGAAG 
TCAGAATGTT 
TCAGCX^GTG 
CCTGCCAGCC 
GAAGCCTCTG 
GACCCCAGGA 
TCCACACCAA 
TTTTCAACAG 
CCAACTTTTA 
TGCGGTCAGT 
ATTAATTTAT 
ACrGAGTTTT 
ATCCTGCTTG 
AAGCAGTCCC 
CAAGTAGAAT 
AGGTCCAGGC 
GCTCCCAGGG 
G6TTGCAGAC 
AGCCAGAACT 
CACrCAACGC 
AGCACTGCCA 



60 
120 
180 
240 
300 
360 
420 
480



60 

120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 






CCAGTAGGTT 
CTGCAAAOCG 
CAATGATCCr 
TGAAOCATTA 
AATTAGGCAG 
TATAGTTCA.T 
CATCCTCTTT 
ACCTATTTCT 
CAACTTGCTT 
CTCTTCACrC 
AATGGCATGA 
CTGGACTGGT 
TTTATGAGAA 
TGTACCAAGA 
TCOCCACTGT 
AAACACACAC 
TTATTCTATA 
AGATACATAC 
TATATACTAT 
AGATGCCCAA 
ACCAAAAAAA 



ATTTACCTTG 

CCACCTCCCT 
GTATTCAGAC 
ACXyVGATCTA 
ACTCTTTATG 
GTCTGCTATC 
TTCCAACTTG 
TAAACACTTG 
ATCAACTTCC 
TTCAAATGCC 
GAAATACAAA 
TTTCACATTA 
AGCCTTCTTT 
ATCTTGGTTT 
ATCTAGGCAA 
AAAAGGGAAC 
GTTATTAAGT 
AGAATTACTG 
TAAAAAGGTT 
ATOCTTAGAT 
AAAAAAAAAA 



GGAAAGGTOG 
ATACTCCTTG 
AGATGAGGAC 
GTCAATCAAG 
CTTGCAAAAA 
ATTATTOGTA 
GCTGCAGGAA 
CAACCTACCT 
TAAATATTAT 
TGACTAGGGA 
AATACTCAGA 
GAAGACAATT 
TGGGGTCAAC 
GCCTTCCAGA 
CATAGTATTC 
CCAGCTCTAA 
TCTTTAAAAT 
TAACTGATTA 
TACAGAATTT 
CTGGCATGTT 
AA 



TGTTTCTGTA 
GAGCTGAGCA 
TTTGCATGGG 
TCTGTTTACT 
CTACAACCAA 
GATATTGGAC 
TCTTTAAAAG 
GTTGAGCATC 
GAG AtGTGG C 
GCCATGTTTC 
TAAGGTAAAA 
GACAACAGTT 
AGTTTTCCTA 
AAACAAAACT 
ATGACTATGG 
TACATTCCAA 
GTAAAGCCAT 
CACTTGGTAA 
TATGGTGCAT 
AGCCCTTCCT 



AGAAACCTAC 
AATCACCACA 
ACCACAACTA 
GCAAGG7TCA 
TGGAATGTGA 
AAAGAACCTT 
ATGCTTTTAA 
ACAGAATGTG 
TTGGGCAGCA 
ACAAGGTCTT 
TGCCATGATG 
ACATAATTCA 
TGCTTTGAAA 
GCATTTCACT 
ATAAACTAAA 
CTOGTATAGC 
GCTGGAAAAT 
TTGTACTAAA 
TACGTGGGCA 
CCAATTATAA 



TGCCCAGGCA 
AACIGTAATA 
TTTTCAGA3G 
ACTTATTAAC 
TGTTCATGGG 
CTCTATGGGG 
CAGAGTCTGA 
ATAAGGAAAT 
TCCCCTTGAA 
TAAAGTGACT 
CCTCTGTCTT 
CTCTGAGTGT 
CAGAAAAATA 
TTCCCGGTGT 
CACGTGACAC 
ATGCATCTGT 
AATACTGCTG 
GCCAAACATA 
TTGTCTTTTT 
GAGGATATGA 



Seq ID NO: 427 Protein sequence
Protein Accession #: NP_003465



MAARPLPVSP 
SKKHPEVLNI 
YEGHVRGYSD 
SHHNTPNLAA 
LIEIAMKVDK 
DNAQLVSGVY 
DTLDRGCSCQ 
SFGGQKCGNR 
ACRDSSNSCD 
AKPAPGICFE 
IBTNIPLQQG 
AMQCHGRGVC 
LAAGFWYLK 
DSYPPKDNPR 
ALRQAQGTCK 
RSTHTAYIK 



11 
I 

ARAIiLLAIiAG 
RLQRESKELI 
SAVSLSTCSG 
KNVPPPPSQT 
FYRPUIIRIV 
FQGTTIGMAP 
MAVEKGGCIM 



liPEFCTGASP 
RVNSAGDPYG 
GRIIjCRGTHV 
NNRKMCHCEA 
RKTLIRIiLFT 
RLLQCONVDI 
PNPPQKPLPA 



21 
I 

ALLAPCBARG 
INLERNEGLI 
LRGLIVFENB 
WARRHKRETL 
LVGVEVWNDM 
XMSMCTAi3QS 
NASTGYPFPM 
GEPEECMNRC 
HCPANVYLHD 
NCGKVSKSSF 
YLGDOMPDPG 
HWAPPFCDKF 
NKKTTIEIOjR 
SRPLHGZiNVP 
DPLARTTRLT 



31 
I 

VSLHNEGRAD 
ASSFTETKYI* 
SYVLEPMKSA 
KATKYVELVT 
DKCSVSQDPF 
GGIVMDHSDN 



41 

I

EWSASVRSG 
QDGTDVSIiAR 
TNRYKLFPAX 

VADHREFQRQ 
TSliHEFIiDWR 
PLGAAVTLAH 



CNATTCTLKP 
GHSCQOVDGY 
AKCEMRDAKC 
LVLAGTKCAD 
GFGGSTDSGP 
CVRPSRPPRG 
QPQSTQRVLP 
UAliARTPGQW 



SAVCAHGLCC 
CyNGICQTHE 
GKIQCQGGAS 
GKICLNRQCQ 
IRQADNQGLT 
FQPOQAHLGH 
PLHRAPRAPS 
ETGI.RLAPLR 



3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 



51 
I 

Z3IiWIPVKSFD 
NYTVILGHCY 
XLKSVRGSOG 
GKDIiEKVKQR 
KMKLLPRKSH 
EliGHNFGMNH 
CLPNI.PEVRE 
EDCQLKPAGT 
QQCVTIiKGPG 
RPVIGTNAVS 
NISVFGVHEC 
IGILVTILCL 
LGKGLMRKPP 
VPARPIiPAKP 
PAPQYPHQVP 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



Seq ID NO: 428 DNA sequence 

Nucleic Acid Accession #: NM_003714

Coding sequence: 135..1043



GAGGAGGAGG 
GAGGAAGAGG 
TAATACCAAG 
TGGCCACCTT 
ACAGGAGCTC 
GTTTGGTCAA 
GTGAGATTCG 
ATGCCCAGGG 
ACAGGTTOGG 
AGCGGGAATG 
TAGTGGAGAT 
ACTTGCTGCT 
AGTGTGAGCA 
AGAAGCCTCC 
GGGCCCACCA 
GAGGTGCCAA 
TCGGGGGCCT 
AGTATTCTGA 
GTCCATTTTC 
ACGCAGGATT 
TGAGATGGAG 
ACGTACTCAA 
AGTCAGT6GG 
GCAGGGCCCC 
GCAGCAGCCT 
GAATGTAAAA 
AGGGGGTGCT 
CTCTTGGCGA 
TGTCTGGGCr 
ACTGCTTCAA 
TCTAAATAAA 
TAAAAAAAAA 
TTAAAAGCTA 
CACTTGGGGG 
TTTCCCTTAG 
CTGGCGATTC 
GAAAGAAGAG 



11 
1 

GAAAAGGCGA 
AGGAGGAGGA 
AACCATGTGT 
TGACCCGGCG 
CCAGCAGAAA 
CGCTGGOGAT 
G6GCTTACAT 
CAAGTCATTC 
CTGCATAAGC 
CTACCTCAAG 
GATCCATTTC 
GACCTGTGGG 
GAACTGGGGA 
CACGGCGCCC 
CGGG6AAGCA 
GGGTGAGCGA 
TGGGGCTCAG 
TATCCGGAGG 
TTATCTATGG 
CTGTGGGGAC 
ACXXXrPGGGG 
GGGAGOGCGC 
TGTOGGCXGC 
CAGAGCTGGG 
CTGGTGCTGT 
ATAAATATCX3 
TGGTGCCAAA 
GOGTGGAGGG 
GGGGGG6ACA 
ATCTOGATTT 
TGGCTTTCAA 
AAAACCAGCC 
TCAAACAGCG 
AAACCTTATA 
GATTTCGTTA 
CAGGAGACCC 
AAT6AAGACT 



21 
I 

GCAAAAAGGA 
AGAGGGGAGC 
GCGGAGCGGC 
OGGGGGACCG 
GGCCGCCTGT 
GTGGGGTGTG 
GGGATTTGCA 
ATCAAAGACG 
CGGAAGTGCX 
GAOGACCTGT 
AAGGACTTGC 
GAGGAGGTGA 
AGCCTGTGCT 
CCCGAGCGCC 
GGACATCACC 
GGTAGCAAGA 
GGACCTTCCG 
TGAAATGAAA 
ACATTCCAAA 
TGTGGACTTC 
CCX3TGGGGTC 
COGCGTTATC 
TCTGTTGTGG 
CCACACAGTG 
CTCCGCGGAA 
CTTAGAATGC 
CTGAAATTCA 
ACGAGTGTCA 
CTGTCCAAGG 
CACTTTTTTT 
ACAAAGCAAC 
CATCCTTTGA 
ACATAGCCAT 
CCCAGAGGAA 
TCTCACCTTG 
AGCTGQAAAC 
ACTTAGTAAT 



31 
I 

AGAGTGGGAG 
ACAAAGGATC 
TGGGCCAGTT 
ACGCCACCAA 
CGCTGCAGAA 
GCGTGTTTGA 
TGACTTTTCT 
CCTTGAAATG 
CGGCCATCAG 
G0GC6GCTGC 
TGCT6CACGA 
AGGAGGCCAT 
CCATCTTGAG 
AGCCCCAGGT 
TCCCAGAGCC 
GCCACCCAAA 
GAAGCAGOGA 
GGCCTGGCCA 
ACATTTACCA 
ATCGAGGTGT 
TCAGGGGTGC 
CTOGTACCTT 
OGGAGGTGAA 
OGTGCTGGGC 
GTCAGGGGGG 
AGGAGAAGGG 
GTTTCTTGTG 
TTTCTATGTG 
GAGTGGCCCC 
ATTTATCCAG 
TGGGTCATTA 
GGCTGATTTT 
ACATCTGACT 
AATACACACC 
ACCCTCAGCC 
CTGGCTTCTC 
TCCCATCAGG 



41 

1 

GAGGAGGGGA 
CAGGTCTCCC 
CATGACCCTG 
CCCACCCGAG 
TACAGCGGAG 
ATGTTTOGAG 
GCACAACGCT 
TAAGGCXXAC 
GGAAATGGTG 
CCAGGA6AAC 
ACCCTAOGTG 
CACCCACAGC 
CTTCTGCACC 
GGACAGAACC 
CAGCAGTAGG 
CGCCCATGCC 
GTGQGAAGAC 
OCSAAATCTTT 
TTAGAGAGGG 
GTGTT0G0G6 
CTGGTGAATT 
TGTCTTCTTT 
CCAGGGAGOG 
CrOGCCCCGA 
CTGGATTCCA 
TGGAGAGGAG 
TGGGGCCTTG 
TAATTTCTGA 
TATGAGTTTA 
TTATATCTAC 
AAACCAGCTC 
TCTTTTTTTT 
GCCTGACATG 
TGGGGAGTAC 
AAGATTGGTA 
CATGTGAGGG 
AAATGCTGAC 



51
I 

AGOGGOGAAG 
GAOGGGAGGT 
GCTTTGGTGT 
GGTCCCCAAG 
ATCCAGCACT 
AACAACTCTT 
GGAAAATTTG 
GCTCTGCGGC 
TCCCAGTTGC 
ACCCGGGTGA 
GACCTOGTGA 
GTGCAGGTTC 
TCGGCCATCC 
AAGCTCTCCA 
GAGACTGGCC 
OGAGGCAGAG 
GAACAGTCTG 
CCTCCAOGCC 
GGGATGTCAC 
AACGGACACG 
CTGCACTTAC 
CCATCTGTGG 
GCAGGGCAAG 
AGCTTCTGGT 
G6ACAGGAGT 
GCAGGGGC06 
CGGTTCAGAG 
GCCATTGTAC 
TATTTTAACC 
ATATCTGTCA 
AAAGGGGGTT 
AAGTTCTATT 
GACT0CT6CC 
ATTTGAOUVA 
AAGCTGCGTC 
GATGGGAAAO 
CTTTTACATA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
AAATCAAGGA GACTGCTGAA AATCTCTAAG GGACAGGATT TTCCAGATCC TAATTGGAAA 2280 
TTTAGCAA7A AGGAGAGGAG TCCAAGGGGA CAAATAAAGG CAGAGAGAGA GAGAGAGAGA 2340 
GGGAGAGGAA GAAAAGAGAG A6AGAAAAGA GOCTGGIXKrC 

Seq ID NO: 429 Protein sequence
Protein Accession #: NP_003705

1 11 21 31 41 51

| | | | | |

MCAERIiGQFM TLALVLATFD PASGTDAINP PEGPQOHSSQ QKGRLSLQNT AEIQECLVMA 60 
GDVGCEVFEC PENNSCBIRG LBGICHTFLH KACXFDAQGK SFIKDALKOC AHALSHRFGC 120 
ISRKCPAIRB MVSQIiQRECY LKRDXiCAAAQ QITRVIVWI EFXDLLLHEP YVDLVHIiliLT 180 
OGEEVKEAIT KSVQVOCEQN HGSUSILSF CTSAIQKPPT APPERQPQVD RTKLSRAHHG 240 
EAGHHLPBPS SRETGRGAKG BSGSKSHPNA HARGRVGGU3 AQGPSGSSEH EDEQSEYSDI 300 



Seq ID NO: 430 DNA sequence
Nucleic Acid Accession #: NM_005940
Coding sequence: 23..1489

1 11 21 31 41 51

| | | | | |

AAGCCCAGCA GCCCCGGGGC GGATGGCTCC GGCCGCCTGG CTCCGCaGOG OGGCCGCGCG 60 

OGCCCTCCro CCCCGGATGC TGCTGCTGCT GCTCCAGCOG CCGCOGCTGC TGGCCCGGGC 120 

TCTGCCGCCG GACXTTCCACC ACCTCCATGC OGAGAGGAGG G6GCCACAGC CCTGGCATGC 180 

AGCCCTGCCC AGTAGCCOGG CACCTGCCCC TGCCACGCAG QAAGCCOCCC GGCCTGCCAG 240 

CAGCCTCAGG CCTCCCCGCT GTGGCGTGCC CGACCCATCT GATGGGCTGA GTGCCCGCAA 300 

COGACAGAAG AGGTTCGTGC TTTCTOGOGG GCGCTGGGAG AAGACGGACC tCACCTACAG 360 

GATCCTTCGG TTCCCATGGC A G TT G GTGCA OGAGCAGGTG CGGCAGACX3A TGGCAGAGGC 420 

CCTAAAGGTA TGGAGCGATG TGACGOCACT CACCTTTACT GAGGTGCAOG AGGGCCGTGC 480 

TGACATCATG ATCGACTTCG CCAGGTACTG GCATGGGGAC GACCTGCCX3T TTGATGGGCC 540 

TGGGGGCATC CTGGCCXATG CCTTCTTCOC CAAGACTCAC GGAGAAGGGG ATGTCCACTT 600 

CGACTATGAT GAGACCTGGA CTATCGGGGA TGACCAGGGC ACAGACCTGC TGCAGGTGGC 660 

AGCCCAT6AA TTTGGCCACG TGCTGGGGCT GCAGCACACA ACAGCAGCCA AGGCCCTGAT 720 

GTCCGCCTTC TACACCTTTC GCTACCCACT GAGTCTCAGC CCAGATGACT GCAGGGGOGT 780 

TCAACACCTA TATGGCCAGC CCTGGCCX»C TGTCACCTCC AGGACCCCAG CCCTGGGCXX 840 

CCAGGCTGGG ATAGACACC3V ATGAQATTGC ACCGCTGGAG CCRGft OGCCC CGCCAGATGC 900 

CT6TGAGGCC TCCTTTGAOG CGGTCTCCAC CATCCGAGGC GAGCTCTTTT TCTTCAAAGC 960 

GGGCTTTGTG TCGCX^CCTCC GTGGGGGCCA GCTGCAGCCC GGCTACCCAG CATTGGCCTC 1020 

TCGCCACTGG CAGGGACTGC CCAGCCCTCT GGACGCTGCC TTCGAGGATG CCCAGGGCCA 1080 

CATrrCGTTC TTCCAAGGTG CTCAGTACTG GGTGTACGAC GGTGAAAAGC CAGTCCTGGG 1140 

CCCOGCAOCC CTCACCGAGC TGGGCCTGGT GAGGTTCCCG GTCCATGCTG CCTTGGTCTG 1200 

GGGTCCCXaG AAGAACAAGA TCTACTTCTT CCGAGGCAGG GACTACTGGC GTTTCCACCC 1260 

CAGCACCCGG CGTGTAGACA GTCCCGTGCC CCGCAGGGCC ACTGACTGGA GAGGGGTGCC 1320 

CTCTGAGATC GACGCTGCCT TCCAGGATGC TGATGGCTAT GCCTACTTCC TGGGCGGCGG 1380 

CCTCTACTGG AAGTTTGACC CTGTGAAGGT GAAGGCTCTG GAAGGCTTCC OCCGTCTOGT 1440 

GGGTCCTGAC TTCTTTGGCT GTGCCGAGCC TGCCAACACT TTCCTCTGAC CATGGCTTGG 1500 

ATGCCCTCAG GGGTGCTGAC CCCTGCCAGG CCACGAATAT CAGGCTAGAG ACCCATGGCC 1560 

ATCTTIGTGG CTGTGGGCAC CAGGCATGGG ACTGAGCCCA TGTCTC CTGC AGGGGGATGG 1620 

GG1X3GGGTAC AACCACCA7G ACAACTGCGG GGAGGGCGAC GCAGGTOGTG GTCACCTGCC 1680 

AGGGACTGTC TCAGACTGGG CAGGGAGGCT TTGGCATGAC TTAA6AGGAA GGGCAGTCTT 1740 

GGGACCCX3CT ATGCAGGTCC TGGCAAACCT GGCTGCCCTG TCTCATCCCT GTCCCTCAGG 1800 

GTAGCACCAT GGCAGGACTG GGGGAACTGG AGTGTCCTTG CTGTATCCCT GTTGTGAGGT 1860 

TOCTTCCAGG GGCTGGCACT GAAGCAAGGG TGCTGGGGCC CCATGGCCTT CAGCCCTGGC 1920 

TGAGCAACTG GGCTGTAGGG CAGGGCCACT TCCTGAGGTC AGGTCTTGGT AGGTGCCTGC 1980 

ATCTGTCTGC CTTCTGGCTG ACAATCCTGG AAATCTGTTC TCCAGAATCC AGGCCAAAAA 2040 

GTTCACAGTC AAATGGGGAG GGGTATTCTT CATGCAGGAG ACCCCAGGCC CTGGAG6CTG 2100 

CAACATACCT CAATCCTGTC CCAGGCCGGA TCCTCCTGAA GOCCTTTTOG CA GCACIGCT 2160 

ATCCTCCAAA GCCATTGTAA ATGTGTGTAC AGTGTGTATA AACCTTCTTC TTCTTTTTTT 2220 
TTTTTAAACT GAGGATTGTC ATTAAACACA GTTGTTTTCT 



Seq ID NO: 431 Protein sequence 
Protein Accession #: NP_005931

1 11 21 31 41 51 

MAPAAHIiRSA AARALLPPML LLLLQPPPLL ARALPPDVHH LHAERRGPQP WHAALPSSPA 60 

PAPATQEAPR PASSLRPPRC GVPDPSDGI.S ARNRQKRFVL SGGRWEKTDL TYRILRFPWQ 120 

LVQBQVRQTM AEALKVHSDV TPLTPTEVHB GRADIMIDPA RYWHGDDLPF DGPGGILAHA 180 

FFPKTHREGD VHFDYDETWT IGDDQGTDLL QVAAHEFGHV U3LQHTTAAK ALMSAPYTFR 240 

YPIjSLSPDDC RGVQHLYGQP WPTVTSRTPA LGPQAGIDTN BIAPLETOAP PDACEASFDA 300 

VSTIRGELFP PKAGFVWRLR GGQLQPGYPA LASRHHQGLP SPVDAAFEDA QGHIWFFQGA 360 

QYWVYDGEKP VLGPAPLTEL GLVRFPVKAA LVHGPEKNKI yPFRGRDYWR FHPSTRRVDS 420 

PVPRRATDWR GVPSBinAAF QDAD6YAYPL RGRLYWKFDP VKVKALBGFP RLVGPDFFGC 480 
AEPANTFL 



Seq ID NO: 432 DNA sequence 
Nucleic Acid Accession #: NM_024022
Coding sequence: 202..1563



1 11 21 31 41 51 

ACCGGGCACC GGACGGCTCG GGTACTTTCG TTCTTAATTA GGTCATGCCC GTGTGAgZCA 60 

GGAAAGGGCT GTGTTTATGG GAAGCCAGTA ACACTGTGGC CTACtATCTC TTOCGTGGTG 120 

CCATCTACAT TTTTGGGACT CGGGAATTAT GAGGTAGAGG TGGAGGCGGA GCCGGATGTC 180 

AGAGGTCCTG AAATACTCAC CATGGGGGAA AAT6ATCCGC CTGCTGTTGA AGCCOCCTTC 240 

TCATTCCGAT OGCmTrGG CCTTGATGAT TTGAAAATAA GTOCIGTTGC ACCAGATGCA 300 
GATGCTGTTC CTGCACAGAT CCTGT CA CTG CTGCCATTCA AGTTTrTTCC AATCAT OGTC 360 

ATTGGGATCA TTGCATTGAT ATTAGCACTG GCCATTOGTC TGGGCATCCA CTTCGACTCC 420 

TCAGGGAAGT ACWSATCTCG CTCATCCTTT AAGTGTATCG ACCTGATAGC TCGATGTGAC 480 

GGftCTCTCGG ATTCaVAAGA OQGGGAGGAC GAGTACOGCT GTGTCCGGGT GGGTCGTCAG 540 

AATGCCGTGC TCC3VGGTGTT CACACSCTGCT T0GTCGAA<3A CCATGTGCTC CGATGACTOG 600 

AAGGGTCACT ACGCAAATGT TGCCTGTGCC CAACEGGGTT TCCC RAGCTA TGTGAGTTCa 660 

GATAACCTCA GAGTCAGCTC GCTGGAGGGG CAGTTCCGQG AGGAGTTTGT GTCCATOSAT 720 

CACCTCTTCC CAGATGACAA GGTGACTGCA TTACACCACT CAGTATATGT GAGGGAGGGA 780 

TGTGCCTCrG GCCACGTCGT TACCTTCCAG TGCAOVGCCT GTGGTCATAG AAGGGGCTAC 840 

AGCTCACGCA TCJGTCGGTGg' AAACATGTCC TTGCTCTCCC AGTCGCCXnC GCAGGCCAGC 900 

CTTCAGTTCC AOGGCTACCA CCrGTGOGGG GGCTCTGTCA TCRCGCCCCT GTG GATCATC 960 

ACTGCTGCAC ACTGTCTTTA TGACTTGTAC CTCCCCAAGT CATGGACCAT CCftGCTGCSCT 1020 

CTAGTTTCCC TCTTGGACAA TCCAGCCCCA TaXACTTGG TCGAGAAGAT TGTCTACCAC 1080 

AGCAAGTACA AGCCAAAGAG GCTGGGCAAT GACATCGCCC TTATGAAGCT GGCOGGGCCA 1140 

CTCACGTTCA ATGAAATCAT CCAGCCTGTG TGCCTGCCCA ACTCTGAAGA GAACTTCXXC 1200 

GATGGAAAAG TOTOCTOJAC GTCAGGATGG GGGGCCACAG AGGATGGAGG TGACGCCTCC 1260 

CCTGTCCTGA ACCACGOGGC CGTOCCTTTG ATTTCCAACA AGATCTGCAA CCACAGGGAC 1320 

GTGTACGGTC GCATCATCTC CCCCTCCATG CTCTGCGCGG GCTACCTGAC GGGTGGQgPG 1380 

GACAGCTGCC AGGGGGACAG OSGGGGGCCC CTGGTGTGTC AAGAGAGGAG GCTGTQC»AG 1440 

TTAGTGGGAG GGACCAGCTT TGGCATOGGC TGCGCAGAGG TGAACAAGCC TGGGGTGTAC 1500 

AOCOGTGTCA CCTCCTTCCT GGACTGGATC CACGAGCAGA TGGAGAGAGA CCTAAAAACC 1560 

TGAAGAGGAA GGGGACAAGT AGCCACCTGA GTTCCTGAGG TGATGAACAC AGCCCXy^TCC 1620 

TCCCCTGGAC TCCayTOTAG GAACCTGCAC AOGAGCAGAC ACCCTTGGAG CTCTGACTTC 1680 

CGGCACCAGT AGCAGGCCCG AAAGAGGCAC CCTTCCATCT GATTOCAGCA CRAOCTTCAA 1740 

GCTGCTTTTT GmTTTCTT TTTTTGAGGT GCSAGTCfOGC TCTGTTGCCC AG6CTGGAGT 1800 

GCAGTGGCGA AATCCCTGCT CACTGCAGCC TCC3GCTTCCC TGGTTCAAGC GATTCTCTTG 1860 

CCTCAGCTTC CCCAGTAGCT GGGACCACAG GTGCCCGCCA CCACACCCAA CTAATTTTTG 1920 

TATTTTTAGT AGAGACAGGG TTTCACCATG TTCGCCAGGC TGCTCTCAAA CCCCTGACCT 1980 

CAAATGATCT GOCTGCTTCA GCCSCOChCA GTGCTGGGAT TACRGGCATG GGCCAOCACG 2040 

CCTAGCCTCA OSCTCCTTTC TGATCTTCAC TAAGAACAAA A6AAGCAGCA ACTTGCAAGG 2100 

GCGGCCTTTC CCACTGGTCC ATCTGGTTTT CTCTCCAGGG GTCTTGCAAA ATTCCTGACG 2160 

AGATAAGCAG TTATGTGACC TCACGTGCAA AGCCACCAAC AGCCACTCAQ AAAAGAGCSCA 2220 

CCAGCCX»GA AGTGCAGAAC TGCAGTCACT GCACGTTTrC ATCTCTAGGG ACCAGAACCA 2280 

AACCCACCCT TTCTACTTCC AAGACTTATT TTCACATGTG GGGAGGTTAA TCTAGGAATG 2340 

ACTOGTTTAA GGCCTATTTT CATGATTTCT TTGTAGCATT TGGTGCTTGA CGTATTATTG 2400 

TCCTTTGATT CCAAATAATA TGTTTCCTTC CCTCAAAAAA AAAAAAAAAA AAAAAAAAAA 2460 



Seq ID NO: 433 Protein sequence
Protein Accession #: NP_076927

1 11 21 31 41 51 

MGENDPPAVE APPSFRSLFG LDDLKISPVA PDADAVAAQI LSLLPLKFFP IIVIGIIALI 60 

LALAIGLGIH FDCSGKYRCR SSFKCIELIA RCDGVSDCKD GEDEVRCVRV GGQNAVLQVF 120 

TAASWKTMCS DDWKGHYANV ACAQLGFPSY VSSDNLRVSS LEGQFREEFV SIDHLLPDDK 180 

VTALHHSVYV REGCASGHW TLQCTACGHH RGYSSRIVGG NMSU.SQWPW QASLQFQGYH 240 

LOGGSVITPL WIITAAHCVY DLYLPKSWTI QVGLVSLLDN PAPSHLVEKI VYHSKYKPKR 300 

LGNDIALMKL AGPLTFNEMI QPVCI*PNSEB NFPDGKVCWT SGWGATEDGG DASPVLNHAA 360 

VPLISNKIQi HRDVYGGZZS PSMLCAGYLT GGVDSCQGDS QGPLVCQERR LWCLVGATSP 420 
GIGCAEVNKP GVYTRVTSPIi DWIHEQMEBD LKT 



Seq ID NO: 434 DNA sequence 

Nucleic Acid Accession #: NM_000493.2

Coding sequence: 97..2139

1 11 21 31 41 51 

CACCTTCTGC ACTGCTCATC TGGGCAGAGG AAGCTTCAGA AAGCTGCCAA GGCAC CATC T 60 

CXaCGAACTC CCAGCACGCA GAATCCATCT GAGAATATGC TGCCACAAAT ACCCTTTTTG 120 

CTGCTAGTAT CCTTGAACTT GGTTCATGGA GTGTTTTACG CTGAACGATA CCAAATGCCC 180 

ACAGGCATAA AAGGCCCACT ACX3CAACACC AAGACACAGT TCTTCATTCC CTACACCATA 240 

AAGAGTAAAG GTATAGCAGT AAGAGGAGAG CAAG6TACTC CTGGTCCACC AGGCCCTGCT 300 

GGACCTCGAG GGCACCCAGG TCCTTCTGGA CCACCAGGAA AACCAGGCTA OQGAAGTCCT 360 

GGACTCCAAG GAGAGCCAGG GTTGCCAGGA CCACCGGGAC CATCAGCTGT AGGGAAACCA 420 

GGTGTGCC3VG GACTCCCAGG AAAACCAGGA GAGAGAGGAC CATATGGACC AAAAGGAGAT 480 

GTTGGACCAG CTGGCCTACC AGGACCCCGG GGCCCACCAG GACCACCTGG AATCCCTGGA 540 

CCGGCTGGAA TTTCXGTGCC AGGAAAACCT GGACAACRGG 6ACCCACAGG AGCCCCAGGA 600 

CCCAGGGGCT TTCCTGGAGA AAAGGGTGCA CCAGGAGTCC CTGGTATGAA TGGACAGAAA 660 

GGGGAAATGG GATATGGTGC TCCTGGTCGT CCAGGTGAGA QGGGTCTTOC AGGOCCTCAG 720 

GGTCCCACAG GACCATCTGG CCCTCCTGGA GTGGGAAAAA GAGGTGAAAA TGGGGTTCCA 780 

GGACAGCCAG GCATCAAAGG TGATAGAGGT TTTCCGGGAG AAATGGGACC AATTGGCCCA 840 

CCAGGTCCCC AAGGCCCTCC TGGGGAACGA GGGCCAGAAG GCATTGGAAA GCCAGGAGCT 900 

GCTCGAGCCC CAGGCCAGCC AGGGATTCCA GGAACARAAG GTCTCCCTGG GGCTCCAGGA 960 

ATAGCTGGGC CCCCAGGGCC TCCTGGCTTT GGGAAACCAG GCTTGCX»GG CCTGAAGGGA 1020 

GAAAGAGGAC CTGCTGGCCT TCCTGGGGGT CCAGGTGCCA AAGGGGAACA AGGOOCAGCA 1080 

GGTCTTCCTG GGAAGCCAGG TCTGACTGGA CCCCCTGGGA ATATGGGACC CCAAGGACCA 1140 

AAAGGCATCC CGGGTAGCCA TGGTCTCCCA GGCCCTAAAG GTGAGACAGG GCCAGCTGGG 1200 

CCTGCAGGAT ACCCTGGGGC TAAGGGTGAA AGGGGTTCCC CTGGGTCAGA TGGAAAACCA 1260 

GGGTACCCAG GAAAACC3UX5 TCTCGATGGT CCTAAGGGTA ACCCAGGGTT ACCAGGTCCA 1320 

AAAGGTGATC CTQGAGTTGG AGGACCTCCT GGTCTCCCAG GCCCTGTGGG CCCAGCAGGA 1380 

GCAAAitSGGAA TGCCCGGACA CAATGGAGAG GCTGGCCCAA GAGGTGCCCC TGGAATACCA 1440 

GGTACTAGAG GCCCTATTGG GCCACCAGGC ATTCCAGGAT TCCCTGGGTC TAAAGGGGAT 1500 

CCAGGAAGTC CCGGTCCTCC TGGCCCAGCT GGCATAGCAA CTAAGGGCCT CAATG6ACCC 1560 

ACCGGGCCAC CACGGCXTTCC AGGTCCAAGA GGCCACICTG GAGAGC CTGG TCTTCCAGGG 1620 

OCCCCTGGGC CTCCAGGCCC ACCAGGTCAA GCAGTCATGC CTGAGGGTTT TATAAAGGCA 1680 

GGCCRAAGGC CCAGTCTTTC TCGGACCCCT CTTGTTAGTG CCAACCAGGG GGTAACAGGA 1740 
ATGCCrGTGT crCCTTTTAC TGTTATTCTC 
ATAOCATTTG ATAAAATTTT GTATAACAGG 
TTTACTTGTC AGATACCAGG AATATACTAT 
CATGTTTGQG TACGCCTGTA TA AGAAT GGC 
ACCAAAGGCT ACCTGGATCA GGCTTCAGGG 
CAGGTGTGGC TCCAGCTTCC CAATGCOGAG 
CACTCCTCTT TCTCAGGATT CCTAGTGGCT 
TAAATCTTGT GCTAGAAAAA GCATTCTCTA 
AGGTAGGCTG AAAAGAATGT AATTTTTATT 
AACAAACCTT CCCCCTGAAA AGTGAGCAGC 
AATTTCTAGT TAGCAATCTT AAGGCTCTTT 
CAAAGAAGTC CTGCTATGTT AAAAACAAAC 
TAAAAAAAAA AACAGAAATA G AGCTC TAAG 
ATTTCCTTTT TAAAAAAGCC TGTTTCTAAC 
AGGAGGTATC ATATAACTTT GTAGAACTTA 
TOTATCCCCT AAAATATTTC TGATGGTGCA 
CAATATCTAT TCAAATATAC AGGTGCATAT 
CCCAAAATAT TGAAGTTCAT CTGAAATGCA 
CTTTTCTATG ATTGCAGAGA AGCTTTTTAT 
GACCTATTCT TATrrAGTTA ACACAAGTGT 
AATCTTATGT GATATCATTT TCTGGATTTA 
CCATTCAAGT GAAGTTATAA TTTACACTGA 
ATATTATTTA TTTATGCACT GTAC TGTAT T 
TGCCTCACTT ATTAAAGCAC AAAATGTTTT 
AACATCAATA GATTTTTAGG CTQAATTAAT 
TTCTTTCAAG GCTTTTCATT CGACACAATA 



•nXAAAGCTT 
CAACACCATT 
TTTTC3VTACC 
ACCCCTGTAA 
AGTGCCATCA 
TCAAATOGCC 
CCAATGTGAG 
ACTCTACCXTC 
TTCTGAAATA 
AAC6TAAAAA 

AAGGrrrrcT 

AACAAAAAAC 

TTATGTGAAA 
TATGAATATG 
AATACTTGAA 
CTACTCTGAG 
ATACTTGTTA 
AGGTGdTTC 
ATACCCAGCA 
GATTAATTTG 
CAGAACATTA 
GGGTTTCAAA 
TTTATATTGC 
AOCTACTCCT 
TTGAAAGGAG 
AAATAACATC 



AOCX»GCAAT 
AT6ACCCAAG 
ACGTGCATGT 
TGTACRCCTA 
TOGATCTCAC 
TATACTGCTC 
TACACXXTAC 
ACCCTACAAA 
CAGATTTGAG 
OGTATGTGAA 
CCAATATTAA 
AAAG CAACA A 
TTTGATTTGA 
AGAACTTCTA 
TATTCAAATT 
GCCTGTATGG 
AW3CTCTTAT 
ATCAAT6AAC 
TAAC TTGG AA 
ATTTCTTTAA 
GCACATGTAC 
ATTOGACTAG 
TGTTtAAAAC 
TATTTACGAC 
CAATTTGCTG 
AATAG 



AGGAAdCCC 
GACTGGAATC 
GAAAGGGACT 
TGATGAA3AC 
AGAAAAtXSAC 
7GAGTATGTC 
AGAGCTAATC 
ATCCATATGG 
CTATCAGACC 
GCCTCTCTTG 
AAAATATCAC 
AAAAAAAAAT 
GAAACTOGGC 
GGAAACATCC 
TAAAAGACAC 
CCCCTTTCAT 
ATAAAAAAGC 
CTTTTCAAAA 
ACAGGTATCT 
TTCCTTATTG 
CTTGTGCCTC 
AAGTGGAGAT 
TTTTAAGCTG 
ACAATAAAAT 
TTCTCAACCA 



1800

1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760
2820 
2880 
2940 
3000 
3060 
3120 
3180
3240 



Seq ID NO: 435 Protein sequence 
Protein Accession #: NP_000484.2



MX^IPFUiL 
TPGPPGPAGP 
GPYGPKGDVG 
VPGMNGQKGE 
GEMGPIGPPG 
PGLPGLKGER 
KGETGPAGPA 
PGPVGPAGAK 
ATKGLNGPTG 
SANQGVTGKP 
YHVHVKGTHV 
GLYSSEYVHS 



11 
1 

VSLNLVHGVF 
RGHPGPSGPP 
PAGLPGPRGP 
MGYGAPGRPG 
FQGPPGERGP 
GPAGLPGGPG 
GYPGAKGERG 
GMPGHNGEAG 
PPGPPGPRGH 
VSAFTVIIiSK 
WVGLYKNGTP 
SFSGFLVAPM 



21 

1 

YAERYQMPtG 
GKPGYGSPGIi 
PGPPGIPGPA 
ERGLPGPQGP 
BGIGKPGAAG 
AKGBQGPA6L 
SPGSDGKPGY 
PRGAPGIPGT 
SGEPGI«PGPP 
AYPAIGTPIP 
VMVTYDEYTK 



31 
I- 

IKGPLPNTKT 
QGEPGLPGPP 
GISVPGKPGQ 
TGPSGPPGVG 
APGQPGIPGT 
PGKPGI/TGPP 
PGKPGLDGPK 
RGPIGPPGIP 
GPPGPPGQAV 
FDKILYNRQQ 
GYLDQASGSA 



41 
I 

QFPIPYTIKS 
GPSAVGKPGV 
QGPTGAPGPR 
KR6ENGVPGQ 
XGLPGAFGIA 
(3IKGPQGPKG 
GNPGLPGPKG 
GFPGSKGOPG 
HPEGFIKAGQ 
HYDPRTGIPT 
IIDLTElimQV 



51 
I 

KGXAVRGBQG 
PGZiPGKPGER 
GPPGEKGAPG 
PGIKGDRGPP 
GPPGPPGFGK 
IPGSHGIiPGP 
DPGVGGPPGL 
SFGPPGPAGX 
RPSLSGTPLV 
CQIPGIYYFS 
HLQLFNABSN 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 



Seq ID NO: 436 DNA sequence 
Nucleic Acid Accession #: XM_062811
Coding sequence: 1..888 



1 
I 

ATGTGGGGCG 
CTGCTGCTGG 
TGGCTG6ACG 
GGGGACGCCA 
GCGCGCCTGG 
CGGGOGGACA 
GTTGGCTCCX3 
AGATGTCTCC 
ATGGAGACCA 
TCCAGCACAG 
AGGTCACAGA 
CCCACGAATT 
CAGTATCTGC 
GCTGTGCCAC 
GCTCACACCA 



11 
I 

CTCGCCGCTC 
CTGCGCTGCT 
CGCAGGGCGT 
OCATCTGCTG 
ACCAGGGOGG 
AAGACGGCCC 
TGTTTGTCGC 
GGCCTAAGCA 
TCCCCATGAT 
CTGCCAGTTC 
CCAACTGrCG 
TCTCTGTGCT 
ATCCCXrCATA 
CTTTCATGGA 
ACAGTGAACA 



21 
I 

GTCCGTCTCC 
GGCGGCGGGG 
CTGGCGCATC 
CX3GCAGCTGC 
CTGCGACAAT 
CGACGGCTCG 
CTTTATCATC 
GGATCCCCAG 
CCCCAGTGCC 
CAGCTCCAGC 
CTTGCCGGAA 
GAACTGTCAG 
0GTG6GGTAC 
CGGCCT6CAG 
GAAGATGTAC 



31 

1 

TCATCCTGGA 
GCGAGGGCCA 
GGCTTCCAGT 
GCGTTGCGCT 
GACXGCCAGC 
GCAGTGCCCA 
TTGGGGTCCC 
CA6AGC0GAG 
AGCACCTCCC 
GCCAACTCAG 
GGGACCATGA 
CAG6CCACCC 
AOGGTGCAGC 
OCTGGCTACA 
OCAGOGGTGA 



41 

I 

ACX5CCGCTTC 
G06GGGAGTA 
GTCCCGAGOG 
ACTGCTGCTC 
AGGGOGCTGG 
TCTAOGTGCC 
TGGTOGCAGC 
CXTCCAOGGGG 
GGGGGTOGTC 
GGGCCOGGGC 
ACAA0C3TGTA 
AGATTGTGCC 
ACGACTCTGT 
GGCAGATTCA 
CTGTATAA 



51 
I 

GCTCCTGCAG 
CTGCCAOGGC 
CTTG6A0GGC 
CAGCGCCGAG 
QGAGCCTGGC 
GTTCCTCATT 
CTGTTGCTGC 
TAACCGCTTG 
CrCACXSCCAG 
GCCCCCAACA 
TGTCAACATG 
ACATCAAGGG 
GCCCATGACA 
GTCCCCCTTC 



60 
120 
180 
240 
300 
360 
420 
480 
540
600 
660 
720 
780 
840



Seq ID NO: 437 Protein sequence 
Protein Accession #: XP_062811



1 11 

I I 
MWGARRSSVS SSWNAASLLQ 
GDATICCGSC ALRYCCSSAE 
VGSVFVAFII LGSLVAACCC 
SSTAASSSSS AKSGARAPPT 
QYIiHPPYVGY TVQHDSVPMT 



21 31 41 51

! I I i 

LLLAALIiAAG ARASGEYCHG WLDAQGVWRI GFQCPERFDG 
ARUIQGGCDN DRQQGAGEPG RADKDGPDGS AVPIYVPFLI 
RCLRPKODPQ QSRAPGGNRl. METIPMIPSA STSRGSSSRQ 
RSQTNCCLPE GTMNNWVNM PTNPSVLNOQ QATQIVPHQG 
AVPPPMDGLQ PGYRQIQSPF PKTNSBQKMY PAVTV 



Seq ID NO: 438 DNA sequence 

Nucleic Acid Accession #: NM_004004.1

Coding sequence: 1..681

1 11 21 31 41 51

| | | | | |

ATGGATTGGG GCACGCTGCA GACGATCCTG GGGGGTGTCA ACAAACACTC CACCAGCATT 
GGAAAGATCT GGCTCACOGT CCXCTTCATT TTTOQCATtA TGATCCTOGT J^T^G^ 

aaggajggtgt gggoagatga gcaggccgac tttgtctgca acacccigca gccaggctgc 



60 
120 
180 
240 



60 
120 
180 




AAGAACGTGT GCTAOGATCA CTACTTCCCC ATCTCCCACa TCOGGTOTG ^CCTGCAG 240 

J^StStOG TCTCCAGCCC AGCXSCTCCTA GTCGOCATCC AOGTGGCCTA COGGAGACAT 300 

GftGAAGAAGA GGAAGTrCAT CAAGGGGGAG ATAAAGAGTO AATTTAAGGA CATOGAGGAG 360 

^S^^ AGAAGGTCOS CATOGAAGGC TCCCTGTGGT GGACCTACAC AAGaG^TC 420 

TTCTTCOGGG TCATCTTCGA AGCOGCCTTC ATGTAOGTCT TCTATGTCAT CTACGAOGGC 480 

ScTOCATOC AGOGGCTGGT GAAGTGCAAC GCCTGGCCTT CTCCCAACAC ^^^GGACTCC 540 

TTTGTGTCCC GGCCCAOGGA GAA6ACTGTC TTCACAGTGT TCATGATTGC ^J^^^ "° 

StTCCATCC TCCTGAAWP CAdGAATTC TGTTATTTOC TAATTAGATA TTCTTCTOGG 660 
AAGTCAAAAA AGCCAGTrPA A 



Seq ID NO: 439 Protein sequence 
Protein Accession #: NP_003995.1



1 11 21 31 41 51

Lgtlqtil Lnkhstsi gkiwltvlfi fR^vvaa ^-^eoad fvc^p^ 

KNVCYDHYFP ISHIRLHALQ LIFVSSPALL VAMHVAYRBH EKKRKFIKGE IKS^IEE 

?S?SShiS slhwtytssi pfrvipeaaf hwyvmydg fskqrlvkch awpcpstvdc 

FVSRPTEKTV FTVFMIAVSG iClLLtJVTBL CYLLIRYCSG KSKKSV 



60 
120 
180 



Seq ID NO: 440 DNA sequence

Nucleic Acid Accession #: XM_061091.1

Coding sequence: 1..2481



60
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720
780
840 
900 



1 11 21 31 41 51

ATGCCAAATA CTTCAGGAAC iACCAGGATT GAAA TTTGGC TTCTCCAA^ 552^^ 
CAcSaGOGC TGGTOGCCGC TCTCCTTCCG GTGAGTCCCA GCCCOGACrT GGCTCTGGOG 

cccSgtacc ogccagtgcc ggctgccgat gaccgattca cgctcccgat gattggagct 

O^TGCATG GTGAGAAGGT AGATCTCTGG AGCCTTCGTG TTCTTrGCTA TGAATTTTTA 
GTTGGGAAGC CTCCTTTTGA GGCAAAOGAA GTCCATGTAA GCAAAGAAAC CATCGG6JA 
AmScCTG CCAGCAAAAT GATGTGGTGC TCGGCTGCAG TGGACATCAT GTTTCTGTTA 

Stggctcta acagcgtcgg gaaagggagc tttgaaaggt ccaagcactt tgccatcaca 

ctSgtScG GTCTGGaSt CAGCCCCX5AG AGGGTCAOAG TGGGAGCATT CCAGTTCAGT 

SSS^ atctggaatt ccccrreGA^r tcattttcaa cccaacagga astg^^^ 

AGAATCAAGA GGATGGTTTT CAAAGGAGGG CGCACGGAGA CXK3AACTTGC J^^lf^ 
CTTCTGCACA GAGGGTTGCC TGGAGGCAGA AATGCTTCTG TGCCCCAGAT CCTCATCATC 

SSSStc gga^t^ gggggatctg gcactgccat ccaagcagct gaaggaaagg 
^StSctg tgtttgctgt gggggtcagg tttcccaggt gggaggagct gcatgcactg 
^^gI ^riaAGGGCA gcacgtcctg ttggctgagc aggtggagga tgccaccaa^ 
SCCTCTTCA gcaccctcag cagctcggcc atctgctcca gcgccaogcc agctgggagc 

SSSctTG TCTTCATGGA GCX3GTTAATG GGCATCOT 

SgCCSGCC AGAATGGAGG CACATGTGTT CCAGAAGGAC TCGACGGCTA CCAGTGCCTC 1020 

™3GAGG GGAGGCTAAC TGTCCCCTGA AG^ 1080 

TCITCCTGCT GGACAGCTCT GOGGGCACCA CTCTGGACGG CTTCCTGCGG 1140 

GcSaAGTCT TCGTGAAGCQ GTTTCTGCGO GCCGTCCTGA GCGAGGACTC TCGGGCCCGA 1200 

GTGGGTOTGG CCACATACAG CAGGGAGCTG CTGGIOGCGG TGCCTOTGGG GGAGTACCAG 1260 

GATGTGCCTG ACCTGGTCTG GAGCCTCGAT GGCATTCCCT TCCGTGGTGG CCCCACCCTG 1320 

^SSS^TC COTGCGGCA GGCGGCAGAG COrGGOT^ "80 

CAGGACCGGC CACGTAGAGT GGTCGTTTTG CTCACTGAGT CACACTCGGA GGATGAGGTT 1440 

GOGGGCCXaG OGCGTCAOK: AAGGGCGOSA GAGCTGCTCC TGC^ 1500 

GCC6TGCGGG CAGA6CTG6A GGAGATCACA GGCAGCCCAA AGCATGTGAT GGTCTACTCG 1560 

GATCCTCAGG ATCTGTTCAA CCAAATCCCT GAGCTGCAGG GGAAGCTGTG CAGCCGGCAG 1620 

?ScSSot GcosGACACA AGCCCTGGAC ciamrrcA TO 1680 

GTAGGGCCCG AGAATTTTGC TCA6ATGCAG AGCTTTGTGR GAAGCTGTGC CCTCCAGTTT 1740 

GAGGTGAACC CTGACGTGAC ACAGGTCGGC CTGGTQGTGT ATGGCAGCCA <3GTGCRGACT 1800 

GCCTTCGGGC TGGACACCAA ACCCACCCGG GCTGCGATGC TGCGGGCCAT TAGCCAGGCC 1860 

CCCTACCTAG GTGGGGTGGG CTCAGCCGGC ACCGCCCTGC TGCACATCTA TGACAAAGTG 1920 

aSScS?^ ^GGGGTGC CCGGCCTGGT GTCCCCA^ 1980 

GGGAGAGGCC CAGAGGATCC AGCCGTTCCT GCCCAGAAGC TGAGGAAC^ 2040 

GTCTTGGTCG TGGGCGTCGG GCCTGTCCTA AGTGAGGGTC TGCGGAGGCT TGCAGGTCOC 2100 

CGGGATTCCC TGATCCACGT GGCAGCTTAC GCCGACCTGC GGTACCACCA GGAOGTGCTC 2160 

ATTGAGTGGC TGTGTGGACA AGCCAAGC^ "20 

ATGAATGAGG GCAGCTGCGT CCTGCAGAAT GGGAGCTACC GCTGCAAGTG TCGGGATGGC 2280 

^SS^ SSctgSa GAACCGTCAG TGGAGCT^ 2340 

GGAT^TTC TTGAGACGCC CCTGAGGCAC ATGGCTCCCG T6CAGGAGGG CAGCAGCCGT 2400 

ACCCCTCCCA GCAACTACAG AGAAGGCCTC GGCACTGAAA TCGTOCXTTAC CTTCTGGAAT 2460 
GTCTGTGCCC CAGGTCCTTA G 

Seq ID NO: 441 Protein sequence
Protein Accession #: XP_061091.1 

1 11 21 31 41 51 

MPNTSGTTRI EIWLLQEPPG HRALVAALLP VSPSPELALA PGYPPVPAAD DRPTLPMIGG 60 

QMHGEKVDLW SLGVLCYEFL VGKPPPEANE VHVSKETIGK ISAASKMMWC SAAVDIMPLL 120 

DGSNSVGKGS PERSKHFAIT VCDGLDISPE RVRVGAFQFS STPHLEFPLD SFSTQQEVKA 180 

RIKRMVFKGG RTETELALKY LLHRGLPGGR NASVPQIHI VTDGKSQGDV ALPSKQLKER 240 

GVTVFAVGVR FPRWEELHAL ASEPRGQHVL LAEQVEDATO GLFSTLSSSA ICSSATPAGS 300 

PELVFHERLM GISLIGPCDS QPCQMGGTCV PBGLOGYQCL CPLAFGGEAN CALKLSLECR 360 

VDLLFLLDSS AGTTLDGFLR A2CVFVKRFVR AVLSEDSRAR V6VATYSREL LVAVPVGEYQ 420 

DVPDLVWSLD GIPFRGGPTL TGSALRQAAE RGFGSATRTG QDRPRRVWL LTBSHSEDEV 480 

AGPARHAHAR EliLliGVGSE AVRAELEEIT GSPKHVMVYS DPQDLPNQIP ELQGKLCSRQ 540 

RPGCRTQALD I.VPMLDTSAS VGPENFAQMQ SFVRSCALQF EVNPDVTQVG LWYGSQVQT 600 

AF^Sra AATtt^ISQA PYLGGVGSAG TALLHIYDKV MTVQRGARPG VPKAWVLIG 660 

CTGAHJAAVP AQKLRNNGIS VLWGVGPVL SBGLRSIAGP RDSLIHVAAY ADLRYHQDVL 720 




lEHLOGEAKQ PVNLCKPSPC MNEGSCVLQN GSYRC3CCRDG t-reX5PHCENRE WSSCSVCVSQ 
GWILETPLHH MAPVQEGSSR TPPSNYREGI, GTEMVPTFMN VCAPGP 

Seq ID NO: 442 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2424






ATGCCCCCTT 
TCTCTCCCTC 
AGCAAAATGA 
AGOGTOGGGA 
CTGGACATCA 
CTGGAATTCC 
ATGGTTTTCA 
GGt3TTGCCTG 
AAGTCCCAGG 
TTTGCTGTGG 
A6AGGGCAGC 
ACCCTCAGCA 
CCCTGTGAGC 
AGAGGATOGC 
AGAGTGTTCC 
TOGCAGCCCT 
CTCTGCCCGC 
AGGGTCGACC 
CGGGCCAAAG 
CX3AGTGGGTG 
CAGGATGTGC 
CTGACGGGCA 
GGCCAGGACC 
GTTGCGGGCC 
GAGGCOGIGC 
TCGGATCCTC 
CAGCGGCCAG 
TCAGTAGGGC 
TTTGAGGTGA 
ACTGCCTTC6 
GCCCCCTACC 
GTGATGACCX5 
GGCGGGAGAG 
TCTGTCTTGG 
CCCCGGGATT 
CTCATTGAGT 
TGCATGAATG 
GGCTGGGAGG 
CAGGGATGGA 
OGTACCCCTC 
AATGTCTGTG 



11 

! 

TCCTGTTGCT 
TCCAGGAAGT 
TGTGGTGCTC 
AAGGGAGCTT 
GCCCCGAGAG 
CCTTGGATTC 
AAGGAGGG06 
GAGGCAGAAA 
GGGATGTGGC 
GGGTCAGGTT 
ACGTGCTGTT 
GCTGGGCCAT 
ACAGGAOGCT 
GGCGGACXXT 
TAACCCACCC 
GCCAGAATGG 
TGGCCTTTGG 
TCCTCTTCCT 
TCTTCGTGAA 
TGGCCACATA 
CTGACCTGGT 
GTGCCTTGOG 
GGCCACGTAG 
CAGOGCGTCA 
OGGCAGAGCT 
AGGATCTGTT 
GGTGCCGGAC 
CCGAGAATTT 
ACCCTGACGT 
GGCTGGACAC 
TAGGTGGGGT 
TCCAGAGGGG 
GCGCAGAGGA 
TCGTGGGOGT 
CCCTGATCCA 
GGCTGTGTGG 
AGGGCAGCTG 
GCCCCCACTG 
TTCTTGAGAC 
CCAGCAACTA 
CCXXAGGTCC 



21 
1 

GGAGGCCCTC 
CCATGTAAGC 
GGCTGCAGTG 
TGAAAGGTCC 
GGTCAGAGTG 
ATTTTCAACC 
CACGGAGAGG 
TGCTTCTGIG 
ACTGCCATCC 
TCCCAGGTGG 
GGCTGAGCAG 
CTGCTCCAGC 
GGA6ATGGTC 
TGCGGTGCTG 
TGCCACCTGC 
AGGCACATGT 
AGGGGAGGCr 
GCTGGACAGC 
GCGGTTTGTG 
CAGCAGGGAG 
CTGGAGCCTC 
GCAGGOGGCA 
AGTGGTGGTT 
CGCAAGGGGG 
GGAG6AGATC 
CAACCAAATC 
ACAAGCCCTG 
TGCTCAGATG 
GACACAGGTC 
CAAACCCACC 
GGGCTCAGCC 
TGCCCGGCCT 
TGCAGCCGTT 
GGGGCCTGTC 
OGTGGCAGCT 
AGAAGCCAAG 
CGTCCTGCAG 
CGAGAACCXrr 
GGCCCTGAGG 
CA6AGAAGGC 
TTAG 



AAAGAAAGCA 
GACATCATGT 
AAGCACTTTG 
GGAGCATTCC 
CAACAGGAAG 
GAACTTGCTC 
CCCCAGATCC 
AAGCAGCTGA 
GAGGAGCTGC 
GTGGAGGATG 
GCCACGCCAG 
OGGGAGTTCG 
GCTGCACACT 
TACAGGACCA 
GTTCCA6AAG 
AACTGTGCCC 
TCTGCGGGCA 
OGGGCCCTGC 
CTGCTGGTGG 
GATGGCATTC 
GAGCGTGGCT 
TTGCTCACTG 
OGAGAGCTGC 
ACAGGCAGCC 
CCTGAGCTGC 
GACCTCGTCT 
CAGAGCTTTG 
GGCCTGGTGG 
CGGGCTGCGA 
GGCACCGCCC 
GGTGTCCCCA 
CCTGCCCAGA 
CTAAGTGAGG 
TACGCCGACC 
CAGCCAGTCA 
AATGGGAGCT 
GAGTGGAGCT 
CACATGGCTC 
CTGGGCACT6 



41 
I 

TGTrrrccAG 

TOSGGAAGAT 
TTCTGTTAGA 
CCATCACAGT 
AGTTCAGTTC 
TGAAGGCAAG 
TGAAA^IACCT 
TCATCATCGT 
AGGAAAGGGG 
ATGCACTGGC 
CCACCAACGG 
ACTGCAGGGT 
CTGGCAATGC 
GTCCCTTCTA 
CCT6GCCAG6 
GACTGGACGG 
TGAAGCTGAG 
CCACTCTGGA 
TGAGOGAGGA 
OGGTGCCTGT 
CCTTCCGTGG 
TOGGGAGCXX: 
AGTCACACTC 
TCCTGCTGGG 
CAAAGCATGT 
AGGGGAAGCT 
TCATGTTGGA 
TGAGAAGCTG 
TGTATGGCAG 
TGCTGCGGGC 
TGCTGCACAT 
AAGCTGTGGT 
AGCTGAGGAA 
GTCTGOGGAG 
TGCGGTACCA 
ACCTCTGCAA 
ACCGCTGCAA 
CTTGCTCTGT 
CCGTGCAGGA 
AAATGGTGCC 



51 
I 

AGTGCCCCCA 
TTCAGCTGCC 
TGGGTCTAAC 
CTGTGACGGT 
CACTCCTCAT 
AATCAAGAGG 
TCTGCACAGA 
CACTGATGGG 
TGTCACTGTG 
CAGCGAGCCT 
CCrCTTCAGC 
CGAGGCTCAC 
CCCATGCTGG 
CACCTGGAAG 
CCCCIGIGAC 
CTACCAGTGC 
CCTGGAATGC 
OGGCTTCCTG 
CTCrCGGGCC 
GGGGGAGTAC 
TGGCCCCACC 
CACCAGGACA 
C6AGGATGAG 
TGTAGGCAGT 
GATGGTCTAC 
GTGCAGCOGG 
CACCTCTGCC 
TGCCCTGCAG 
CXy^GGTGCAG 
CATTAGCCAG 
CTATGACAAA 
GGTGCTCACA 
CAATGGCATC 
GCTTGCAGGT 
CCAGGACGTG 
ACCCAGCCCG 
GTGTCGGGAT 
ATGTGTGAGC 
GGGCAGCAGC 
TACCTTCTGG 



Seq ID MO: 443 Protein sequence 
Protein Accession ft: Eos sequence 



MPPFLLLEAV 
SVGKGSFERS 
MVFKGGRTET 
FAVGVRFPRH 
PCEHRTLEMV 
SQPCQNGGTC 
RAKVFVKRFV 
LTGSALRQAA 
EAVRASLEEI 
SVGPBNFAQM 
APYLGGVGSA 
SVLWGVGPV 
CMNEGSCVLQ 
RTPPSKYREG 



11 

I 

CVFLFSRVPP 
KHPAITVCDG 
EliALKYLLHR 
EELHALASEP 
REFAGNAPCW 
VPBGLDGVQC 
RAVLSEDSRA 
ERGPGSATRT 
TGSPKHVMVY 
QSFVRSCALQ 
GTALLHIYDK 
LSBGLRRIiAG 
KGSYRCKCRD 
LGTEMVPTFW 



21 
I 

SLPLQEVHVS 
LDISPERVRV 
GLPGGRNASV 
RGQHVLLAEO 
RGSRRTLAVL 
LCPLAFGGEA 
RVGVATYSRE 
GQDRPRRVW 
SDPQDLFNQI 
FEVNPDVTQV 
VMTVQRGARP 
PRDSIiZHVAA 
GWEGPHCEMR 
NVCAPGP 



31 
I 

KETIGKISAA 
GAFQPSSTPH 
PQILIIVTDG 
VEOATNGLFS 
AAHCPFYSWK 
MCALKLSIiEC 
LLVAVPVGEY 
LLTESH5EDE 
PELQGKLCSR 
GLWYGSQVQ 
GVPKAWVLT 
YADLRYHQDV 
EHSSCSVCVS 



41 

1 

SKMMWCSAAV 
LEFPLDSFST 
KSQGDVALPS 
TIiSSSAICSS 
RVFLTHPATC 
RVDLLFLLOS 
QDVPDLVH5L 
VAGPARHARA 
QRPGCRTQAL 
TAFGLDTKPT 
GGRGAEDAAV 
LIEHLOGEAK 
QGHIliSTPLR 



Seq ID NO: 444 DNA sequence 

tnideic Acid Accession ft: Eos sequence 

Coding sequence t 89.. 2 3 56 



1 

I 

GCCCCCTGGC 
GTCGCCGCTC 
TGTTTTCCIG 
AGAAAC CATC 
CATCATGTTT 
GCACTTTGCC 
AGCATTCCAG 
ACAGGAAGTG 
ACTTGCTCTG 
CCAGATCCTC 



11 

I 

CCGAGCCGCG 
TCCTTCOGTT 
TTTTCCAGAG 
GGGAAGATTT 
CTGTTAGATG 
ATCACAGTCT 
TTCAGTTCCA 
AAGGCAAGAA 
AAATACCTTC 
ATCATCGTCA 



21 

I 

CCCGGGTCTG 
ATATCAACAT 
TGCCCCCATC 
CAGCTGCCAG 
GGTCTAACAG 
GTGACGGTCT 
CTCCTCATCT 
TCAAGAGGAT 
TGCACAGAGG 
CTGATGGGAA 



31 
I 

TGAGTAGAGC 
GCCCCCTTTC 
TCTCCCTCTC 
CAAAATGATG 
CGTCQGGAAA 
GGACATCAGC 
GGAATTCCCC 
GGTTTTCAAA 
GTTGCCT6GA 
GTCCCAGGGG 



41 
I 

CGCCCGGGCA 
CTGrrGCTGG 
CAGGAAGTCC 
TGGTGCIOGG 
GGGAGCTTT6 
CCOGAGAGGG 
TTGGATTCAT 
GGAGGGCGCA 
GGCAGAAATG 
GATGTGGCAC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140' 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 



51 
1 

DIMFXiLDGSM 
QQEVXARIKR 
KQLKERGVTV 
ATPDCRVEAH 
YRTTCPGPCD 
SAGTTLDGFL 
0GIFFRG6PT 
RELLLLGVGS 
DLVFMIiDTSA 
RAAMLRAISQ 
PAQXLRKNGI 
QPVKLCKPSP 
HHAFVQB6SS 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



51 
I 

CCGAGCGCTG 
AAGCCGTCTG 
ATGTAAGCAA 
CTGCAGTGGA 
AAAGGTCCAA 
TCAGAGTGGG 
TTTCAACCCA 
OGGAGACGGA 
CTTCTGTGCC 
TGCCATCCAA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 



351 



wo 02/086443 

GCAGCTGAAG GAAAGGGGTG TavCTGTGTT TGCTGTGGGG GTCAGGTTTC CC3\GGTGGGA 660 

GGAGCIGCAT GCACTGGCCA GCGAGCCTAG AGGGCAGCAC GTGCTGTTGG CTGAGCAGGT 720 

OGAGGATGCC AOCAAOQGCC TCTTCAGCAC CCTCAGCAGC TCGGCCATCT GCTCCAGOGC 780 

CAOGCCAGAC TOCAGGGTOG AGGCTCAOOC CIGTGAGCAC AGGAO GCTGG AGATGCTCOG 840 

GGAGrrCGCT (3GCAATGCOC GATGCTGGAC AGGATCGCGG CGGACCCTTG OGGTGCTGGC 900 

TGCACACTCT CCCTTCTACA GCTGC3AAGAG ACTGTTCCTA ACCCACCXTTG CCAOCTGCTA 960 

CAGGACCACC TGCCCAGGCX: CCTGTGACTC GCAGCCCTGC CACAATGGAG GCACATGTGT 1020 

TCCAGAAGGA CIGGACGGCT ACCACTGCCT CTGCCCGCTG GCVITIGGAG GGGAGGCTAA 1080 

CiViX X XX' TC AAGCK3AGCC TGGAATGCAG GGTCGACCTC CTCTTCCTGC T GGACAGC TC 1140 

TCOGGGCACC ACTCPGGAOG GCTTCCTGOG GGCCAAACTC TTCGTGAAGC GGTTTGTGCG 1200 

GGCCGTGCTG AGCGAGGACT CTCGGGCCOG AGTGGGTGTG GGCACA3ACA GCAGGGAGCT 1260 

GCTGGTGGOG GTGCCTGTGG GGGAGTACCA GGAtGTGCCT 6ACCTGGTCT GGAGCCTOGA 1320 

TGGCATTCCC TTCCGTGGTG GCCCCACCCT GAOSGGCAGT GCCTTGCGGC AG GOGG CAGA 1380 

GCGTGGCTTC GGGAGCGCCA CCAGGACAGG CCAGGACCGG CCACGTAGAG TGGTGGTTTT 1440 

GCTCACTGAG TCACACTCCG AGGATGAGGT TGOGGGCCCA GOGOGTCACG CAAGGGCGOG 1500 

AGAGCTGCTC CTGCTGGCTG TAGGCAGTGA GQOOGTGOGG GCAGAGCTGG AGGAGATCAC 1560 

AGGCAGCCCA AAGCATCTGA TGGTCTACTC GGATCCTCAG GATCTGTTCA ACCAAATCCC 1620 

TGAGCrcCAG GGGAAGCTGT GCAGCOSGCA GCGGCCAGGG TGCOSGACAC AAGCCCTGGA 1680 

CCTOGTCTTC ATGTTGGACA CCTCTGCCTC AGTAGGGCCC GAGAATTTTG CTCAGATGCA 1740 

GAGCrrroTG AGAAGCTCTG CCCTCCAGTT TGAGGTGAAC CCTGACGTGA CACAGGTCCG 1800 

C CltajfGGT G TATGGCAGCC AGGTCCAGAC TGCCTTOGGG CTGGACACCA AACCCACCCG 1860 

GGCroOGATC CroOGGGCCA TTAGCCAGGC CCCCTACCTA GGTGGGGTGG GCTCAGCCGG 1920 

CACOGCCCTG CTGCACATCT ATGACAAAGT GATGACOGTC CAGAGGGGTO CCCG6CCTGG 1980 

TGTCCCCAAA GCTGTGGTGG TGCTCACAGG GGG6A6A0GC GCAGAGGATG CAGCX?6TTCC 2040 

TGCXX»GAAG CTGAGGAACA ATGGCATCTC TGTCTTGGTX: GT0GGCX5TGG GGCCT6TCCT 2100 

AAGTGAGGGT CTGCGGAGGC TTGCAGGTCC CCGGGATTCC CTGATCCACG TG6CAGCTTA 2160 

CGCOGACCTG CGGTACCACC AGGACGTGCT CATTGAGTGG CTGTGTGGAG AAGCCAAGCA 2220 

GCCAGTCAAC CTCTGCAAAC CCAGCCCGTG CATGAATGAG GGCAGCTGCG TCCTGCAGAA 2280 

TGGGAGCTAC CGCTGCAAGT GTOGGGATGG CTGGGAGGGC C0CCACTG06 AGAACCGATT 2340 

CTTGAGAC3GC CCCTGAGGCA CATGGCTCCC GTGCAG6AGG GCAGCAGCOG TACCCCTCCC 2400 

AGCAACTACA GAGAAGGCCT GGGCACTGAA ATGCrXSCXTTA CCTTCTGGAA TGTCTGTGCC 2460 

CCAGGTCCTT AGAATGTCTG CTTCCGGCaS TGGCCAGGAC CACTATTCTC ACTGAGGGAG 2520 

GAGGATCTCX: CAACTGCAGC CATCCTGCTT AGAGACAAGA AAGCAGCTGA TGTCACCCAC 2580 

AAAOGAIGTT GTTGAAAAGT TTTGATGTGT AAGTAAATAC CCACTTTCTG TACCTGCTGT 2640 

GCCTTGTTGA GGCTATGTCA TCTGCCACCT TTCCCTTGAG GATAAACAAG GGGTCCTGAA 2700 

GACTTAAATT TAGCCGCCTG ACGTTCCTTT GCSVCACAATC AATG CTCGCC AGAATGTTGT 2760 
IGACACAGTA ATGCCCAGCA GAGGCCTTTA CTAGAGCATC CrrTGGACGG 

Seq ID NO: 445 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MPPPLLLEAV CVFLPSRVPP SLPLQBVHVS KETIGKISAA SKMMWCSAAV DIMPLLDGSN 60 

SVGKGSFERS KHPAITVCDG IJ)ISPERVRV GAFQFSSTPH LEFPLDSPST QQEVKARIKR 120 

MVPKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180 

PAVGVRFPRH EELHALASEP RGQHVLLAEQ VEDATNGLPS TLSSSAICSS ATPOCRVEAH 240 

PCEHRTLEMV RBFAGNAPCW RGSRHTLAVL AAHCPPYSWK RVFLTHPATC YRTTCPGPCD 300 

SQPCQNGGTC VPEGLDGYQC LCPLAFGGBA NCALKLSLEC RVDLLPLLDS SAGTTLDGFL 360 

RAKVFVKRFV RAVLSEDSRA RVGVATYSRB LLVAVPVGEY QDVPDLVWSL DGIPPRGGPT 420 

LTGSALRQAA ERGFGSATRT GQDRPRRVW LLTESHSEDE VAGPARHASA RELLZiLGVGS 480 

EAVRAELEEI TGSPKHVMVY SDPQDLPNQI PELQGKLCSR QRPGCRTQAL DLVFMU3TSA 540 

SVGPBNFAQM QSFVRSCALQ FBVNPOVTQV GLWYGSQVQ TAPGLDTKPT RAAMLRAISQ 600 

APYLG6VGSA GTALLHIYDK VMTVQRGARP GVPKAWVI.T GGRGAEDAAV PAQKLRNNGI 660 

SVLWGVGPV LSBGLRRLAG PBDSLIHVAA YADLRYHQDV LIEHLOGEAK QPVNUaCPSP 720 
OflNEGSCVLQ NGSYRCKCRD GWEGPHCENR FLRRP 

Seq ID NO: 446 DNA sequence 

Nucleic Acid Accession #: NM_031942.1 

Coding sequence: 145..1260 

1 11 21 31 41 51 

CCCGAGCCCC GCCCCTCCGG GCCCGGGTCG GCGCGCCCAG CCTGCCAGCC GCGCTGCTGC 60 

TGCTCCTCCT GCTGTCGGAC CGCTGACOGC GCGGCTGCTC CGCTCTCCCC GCTCCAAGCG 120 

CCGATCTGGG CACCOGCCAC CAGCATGGAC GCTCGCCGCG TGCOGCAGAA AGATCTCAGA 180 

GTAAAGAAGA ACTTAAAGAA ATTCAGAXAT GTGAAGTTGA TTTCCATGGA AACCTCGTCA 240 

TCCTCTGATG ACAGTTGTGA CAGCTTTGCT TCTGATAATT TTGCAAACAC GAGGCTGCAG 300 

TCAGTTOGGG AAGGCTGTAG GACOCGCAGC CAGTGCAGGC ACTCTGGACC TCTCAGGGTG 360 

GCGATGAAGT TTCCAGCGOG GAGTACCAGG GGAGCAACCA ACAAAAAAGC AGA6TCCCGC 420 

CAGCCCTCAG AGAATTCTGT GACTGATTCC AACTCOGATT CAGAAGATGA AAGTGGAATG 480 

AATTTTTTGG AGAAAAGGGC TTTAAATATA AAGC3UUVACA AAGCAATGCT TGCAAAACTC 540 

ATGTCTGAAT TAGAAAGCTT CCCTGGCTCG TTCCGTGGAA GACATCCCCT CCCAGGCTCC 600 

GACrCACAAT CAAGGAGACC GCGAAGGCGT ACATTCCOSG GTGTTGCTTC CAGGAGAAAC 660 

CCIGAACGGA GAGCTGGTCC TCTTACCAGG TCAAGGTCCC GGATCXTTOGG GTCCXmXSAC 720 

GCTCTACOCA T6GAGQAGGA GGAOGAAGAO GATAAGTACA TGTTGGTGAG AAAGAGGAAG 780 

ACCGTGGATG GCTACATGAA TGAAGATGAC CTGCOCAGAA GCCXSTOGCTC CAGATCATCC 840 

GTGACCCTTC CGCATATAAT TCGCCCAGTG GAAGAAATTA CAGAGGAGGA GTTGGAGAAC 900 

GTCTGCAGCA ATTCTOGAGA GAAGATATAT AACCGTTCAC TGGGCTCTAC TTGTCATCAA 960 

TGCCGTCAGA AGACTATTGA TACCAAAACA AACTGCAGAA ACCCAGACTG CTGGGGCGTT 1020 

CGAGGCCAGT TCTGTGGCCC CTGCXTTTOGA AACOGTTATG GTGAAGAGGT CAGGGATGCT 1080 

CTGCTGGATC CGAACTGGCA TTGCCCGCCT TGTCGAGGAA TCTGCAACTG CAGTTTCTGC 1140 

CGGCAGCGAG ATCGACGGTG TGCGACTGGG GTCCTTGTGT ATTTAGCCAA ATATCATGGC 1200 

TTTGGGAATG TGCATGCCTA CTTGAAAAGC CTGAAACAGG AATTTGAAAT GGAAGCATAA 1260 

TATCTCGAAA ATTTGCTGCC TGCCTPCTAC TTCTCAAATC TTTCTTGTAA AAGTTTOCAA 1320 

TTTTTTCACT GAAAOCTGAG TTAAAAATCT TGATGATCAG CCTGTTTCAT AAGAAACTCC 1380 

AATCAAGTTA ATCTTAfiCAG ACATGTGTrp CTGGAGCATC ACAGAAGGTA TATTGCTAGT 1440 

TACACTTTGC CCTCCTGCAG TTTCTTCTCT GCTCCCAACC CCCATCTCAT AGCATCCCOC 1500 

TCTATTTCCA ATGCTOCTCT CCAACCGCTT AGTTTCTGAA TTTCTTTTAA ATTACAGTTT 1560 

TATGAAAOCA XATTTTATTT ACf T GGIGIT GAAATAGCGC TCATAAAACC TAAGCACTTG 1620 

GAAACACAAT AATACTATTA ACTAACTAGA TCTATTGAAT TTCAGAGAAG ACCCTTCTAA 1680 

CTTCTTTACA CAAAAACGAG TATGATTTAG CACTCATACT AGTTGAAATT TTTAATAGAA 1740 

TCAAGGCACA AAAGTCTTAA AACCATGTGG AAAAATTAGG TAATT ATTG C AGATTGATGT 1800 

CTCTCAATCC CATGTATTGC GCTTATGTTA aVAGTTGTTG TCACAGTTGA GACTTAATTT 1860 

CTCCTAATTT CTTCTGOCOG AAGGGTAAGT GGTGOGTCCA GCTTACACGA TCATAATTCA 1920 

AAOGTTGGTC GGCAATGTAA TACTTAATTA AAATAATGAT GGAAGAGCTA TCTGGAGATT 1980 

ATGAGTAAGC TGATTTGAAT TTTCAGTATA AAACTTTAGT ATAATTGTAG TTTGCAAAGT 2040 

TTATTTCAGT TCACATGTAA GGTATTGCAA ATAAATTCTT OGACAATTTT GTATGGAAAC 2100 

TTOATATTAA AAACTAGTCT GTGGTTCTTT GCAGTTTCTT GTAAATTTAT AAACC AGGCA 2160 

CAAfiGTTCAA GTTTAGATTT TAAGCACTTT TATAACAATG ATAAGTGCCT TTTTGGAGAT 2220 

GTAACTTTTA GCAGTTrGTT AACCTGACAT CTCTGCCAGT CTAGTTTCTG GGCAGGTTTC 2280 

CTGTGTCAGT ATTCCCCCTC CTCTTTGCAT TAATCAAGGT ATTTGGTAGA GGTGG AATCT 2340 

AAGTGTTTGT ATGTCCAATT TACTTGCATA TGTAAACCAT «CTGTGCCA TTCAATGTTT 2400 

GATGCATAAT TGGACCTTGA ATCGATAAGT GTAAATACAG CTTTTGATCT GTAATGCTTT 2460 
TATACAAAAG TTTATTTTAA TAATAAAATG TTTGTTCTAA AAAAAAAAAA 

Seq ID NO: 447 Protein sequence 
Protein Accession #: NP_114148.1 

1 11 21 31 41 51 

MDARRVPQKD LvKKmiKKP RWKLISMST SSSSOOSCDS FASDOTANTO UJSVREGCRT 60 

RSQCBHSGPL RVAMKFPARS TRGATMKKAE SRQPSENSVT DSNSDSEDES GMNFLEKRAL 120 

NIKQNKAMLA KLMSELESFP GSFRGRHPLP GSDSQSRRPR RHTFPGVASR RNPERRARPL 180 

TRSRSRILGS LDALPMBEEE EEDKyMLVRK RKTV33GYMIIE ODLPRSRRSR SSVTLPHIIR 240 

PVEEITBEBI. BNVCSNSREK lYNRSLGSTC HQCRQKTIDT KTNCRMPDCW GVRGQFOGPC 300 

LRNRYGEEVR DALUJPNWHC PPCRGICMCS FC31QRDGRCA TGVLVYLATOf HGFGHVHAlfL 360 
KSLKQEFEMQ A 

Seq ID NO: 448 DNA sequence 
Nucleic Acid Accession #: NM_019894 
Coding sequence: 1..1314 

1 11 21 31 41 51 

ATGTTACAGG ATCCTGACAG TGATCAACCT CTGAACAGCC TCGATGTCAA ACCCCTGCGC 60 

AAACCCCGTA TCCCCATGGA GACCTTCAGA AAC3GTGGGGA TCCCCATCAT CATAGCACTA 120 

CTGAGCCTGG CGAGTATCAT CATTGTGGTT GTCCTCATCA AGGTGATTCT GGATAAATAC 180 

TACTTCCTCT GCGGGCAGCC TCTCCACTTC ATCOCGAGGA AGCAGCTGTG TGACGGAGAG 240 

CTGGACTGTC CCTTGGGGGA GGACGAGGAG CACTGTGTCA AGAGCTTCCC OGAAGGGCCT 300 

GCAGTGGCAG TCCX3CCTCTC CAAGGACCGA TCCACACTGC AGGTGCTGGA CTCGGCCACA 360 

GGGAACTGGT TCTCTGCCTG TTTCGACAAC TTCACAGAAG CTCTCGCTGA GACAGCCTGT 420 

AGGCAGATCG GCTACAGCAG CAAACCCACT TTa«5AGCTG TGGAGATTGG CCCAGACCAG 480 

GATCTGGATG TTGTTGAAAT CACAGAAAAC AGCCAGGAGC TTCGCATGCG GAACTCAAGT 540 

GGGCCCTGTC TCTCAGGCTC CCTGGTCTCC CTGCACTGTC TTGCCTGTGG GAAGAGCCTG 600 

AAGACCCCCC GTGTGGTGGG TGGGGAGGAG GCCTCTGTGG ATTCTTGGCC TTGGCAGGTC 660 

AGCATCCAGT ACGACAAACA GCACGTCTGT GGAGGGAGCA TCCXGGACCC OCACTGGGTC 720 

CTCACGGCAG CCCACTGCTT CAGGAAACAT ACCGATGTGT TCAACTGGAA GGTGOGGGCA 780 

GGCTCAGACA AACTGGGCAG CTTCCCATCC CTGGCTGTGG CX»AGATCAT CATCATTGAA 840 

TTCAACCCCA TGTACCCCAA AGACAATGAC ATCGCCCTCA TGAAGCTGCA GTTCCCACTC 900 

ACTTTCTCAG GCACAGTCAG GCCCA1CTGT CTGCCCTTCT TTGATGAGGA GCTCACTCCA 960 

GCCACCCCAC TCTGGATCAT TGGATG6GGC TTTAOGAAGC AGAATGGAGG GAAGATGTCT 1020 

GACATACTGC TGCAGGCGTC AGTCCAGGTC ATTGACAGCA CAOGGTGCAA TGCAGACGAT 1080 

GCGTACCAGG GGGAAGTCAC CGAGAAGATG ATGTGTGCAG GCATCCCGGA AGGGGGTGTG 1140 

GACACCTCCC AGGGTGACAG TGGTGGGCCC CTGATGTACC AATCTGACCA GTGGCATGTG 1200 

GTGGGCATCG TTAGCTGGGG CTATGGCTGC GGGGGCCCGA GCACCCCAGG AGTATACACC 1260 
AAGGTCTCAG CCTATCTCAA CTGGATCTAC AATGTCTGGA AGGCTGAGCT GTAA 

Seq ID NO: 449 Protein sequence 
Protein Accession #: NP_063947.1 

1 11 21 31 41 51 

MLQDPDSDQP LsLDVKPLR KPRIPMETFR KVGIPIIIAL I.SLASIIIW VLIKVILDKY 60 

YFLCCQPLHP IPRKQLCDGE LDCPLGEDEB HCVKSFPBGP AVAVRLSKDR STWVLDSAT 120 

GNWFSACFDN FTEALAETAC RQMGYSSKPT FRAVEIGPDQ DLDWEITEN SQELRMRNSS 180 

GPCLSGSLVS LHCLACGKSL KTPRWGGEE ASVDSHPWQV SIQYDKQHVC GGSILDPHWV 240 

LTAAHCFRKH TDVFNWKVRA GSDKLGSFPS LAVAKIIIIE PNPMYPKDND lALMKLQFPL 300 

TFSGTVRPIC LPFFDEEIjTP ATPLWIIGWG FTXQNGGKMS DILU2ASVQV IDSTRC2JADD 360 

AYQGEVTEKM MCAGIPBGGV DTCQGDSGGP LMYQSDQWHV VGIVSWGYGC GGPSTPGVYT 420 



Seq ID NO: 450 DNA sequence 

Nucleic Acid Accession #: XM_051860.2 

Coding sequence: 52..3042 

1 11 21 31 41 51 

GCTCACCCAG GAAAAATATG CAATCGTCCC ATTGATATAC AGGCCACTAC AATGGATGGA 60 

GTTAACCrCA GCACCGAGGT TGTCTACAAA AAAGGCCAGG ATTATAGGTT TGCTTGCTAC 120 

GACCGGGGCA GAGCCTGCCG GAGCTACCGT GTACGGTTCC TCTGTGGGAA GCCTGTGAGG 180 

CCCAAACTCA CAGTCACCAT TGACACCAAT GTGAACAGCA CCATTCTGAA CTTGGAGGAT 240 

AATGTACAGT CATGGAAACC TGGAGATACC CTGGTCATTG CCAGTACTOA TTACTCCATG 300 

TACCftGGCAG AAGACTTCCA GGTGCTTCCC TGCAGATCCT GOGCCCCCRA CCAGGTCAAA 360 

GTGGCAGGGA AACCAATGTA 0CT6CACATC GGGGAGGMSV TAGACGGOGT GGACATGOGG 420 

GCGGAGGTTG GGCTTCTGAG CCGGAACATC ATAGTGATGG GGGAGATGGA GGACAAATGC 480 

TACCCCTACA GAAACCACAT CTGCAATTTC TTPGACrrCG ATAOCTTTQG GGGOCACATC 540 

AAGTTTGCTC TCSGGATTTAA GGCAGCACAC TTGGAGGGCA OGGAGCTOAA GCATATGGGA 600 

CAGCW3CTGG TGGGTCAGTA CCCGATTCAC TTCCACCTGG CCGGTGATGT AGACGAAAGG 660 

GGAGGTTATG ACCCACCCAC ATACATCAGG GACCTCrCCA TCCATCATAC ATTCTCTOGC 720 

TGCGTCACAG TOCATGCCTC CAATGGCTTG TTGATCAAGG AOGTTGTGGG CTATAACTCT 780 

TTGGGCCACT GCTTCTTCAC GGAAGKTGG6 COGGAGGAAC GCAACACTTT TGAOCACIGT 840 

CTTGGCCTCC TTGTCAAGTC TGGAACCCTC CTCCCCTOGG ACOGTGACAG CAAGATGTGC 900 

AAGATCATCA CAGGAGACTC CTACCCAGGG TACATCCCCA AGCCCAGGCA AGACTGCAAT 960 

GCTCTGTCCA CCTTCTGGAT GGCCAATCCC AACAACAACC TCATCAACTG TGCOGCTGCA 1020 

GGATCTGAGG AAACTGGATT TTGGTTTATT TTTCA0CA06 TACCAACGGG CCCCTCOGTG 1080 

C3GAATGTACT CCOCAGGTTA rrCAGAGCAC ATTOCACTGG GAAAATTCTA TAACAACCGA 1140 

GCACATTCCA ACTACOGGGC TGGCATGATC ATAGACAACG GACTCAAAAC CACCGAGGCC 1200 

TCTGCCAAGG ACAAGOGGCC GTTCCTCTCA ATCATCTCTG CCAGATACAG CCCTCACCAG 1260 

GACGCCGACC CGCTGAAGCC CCGGGAGCCG GCCATCATCA GACACTTCAT TGCCTACftflG 1320 

AACCAGGACC AOGGGGCCTG GCTGCGCGGC GGGGATGTGT GGCTGGACAG CTGCOGGTTT 1380 

GCTGACAATG GCATTGGCCT GACCCTGGCC AGTGGTGGAA OCTTCCCGTA TGAOGACGGC 1440 

TCCAAGCAAG AC5ATAAAGAA CAGCTTGTTT GTTGGCGAGA GTGGCAACGT GGGGACGGAA 1500 

ATGATGGACA ATAGGATCTG GGGCCCTGGC GGCTTGGACC ATAGOGGAAG GACCCTCCCT 1560 

ATAGGCCAGA ATTTTCCAAT TAGAGGAATT CAGTTATATG ATQGCCCCAT CAACATCCAA 1620 

AACTGCACTT TCOGAAAGTT TGTGGCCCTG GAGGGCCGGC ACACCAGCXX: CCTCGCCTTC 1680 

CGCCTCAATA ATGCCTGGCA GAGCTGCCCX: CATAACAACG TGACXXX3CAT TGCCTTTGAG 1740 

GAOGTTCCGA TTACTTCCAG AGTGTTCTTC GGAGAGCCTG GGCCCTGGTT CAACCAGCTG 1800 

GACATGGATG GGGATAAGAC ATCTGTGTTC CATGACGTCG AOGGCTCCGT GTCCGAGTAC 1860 

CCTGGCTCCT AOCTCAOGAA GAATGACAAC TGGCTGGTCC GGCAOCCAGA CTGCATCAAT 1920 

GTTCCOGACT GGAGAGGGGC CATTTGCAGT GGGTGCTATG CACAGATGTA CATTCAAGCC 1980 

TACAAGACCA GTAACCTGOG AATGAAGATC ATCAAGAATG ACTTCCCCAG CCACCCTCTT 2040 

TACCTGGAGG GGGOGCTCAC CAGGAGCACC CATTACCAGC AATACCAACC GGTTGTCACC 2100 

CTGCAGAAGG GCTACACCAT CCACTGGGAC CAGAOGGCCC CGGCOGAACT 0QCCATCTG6 2160 

CTCATCAACT TCAACAAGGG OGACTGGATC CGAGTGGGGC TCTGCTAOCC GOGAGGCACC 2220 

ACATTCTCCA TCCTCTOGGA TGTTCACAAT CGCCTGCTGA AGCAAAOGTC CAAGACGGGC 2280 

GTCTTOGTGA GGACCTTGCA GATGGACAAA GTGGAGCAGA GCTACCCTGG CAGGAGCCAC 2340 

TACTACTGGG ACGAGGACTC AGGGCTGTTG TTCCTGAAGC TGAAAGCTCA GAAGGAGAGA 2400 

GAGAAGTTTG CTTTCTGCTC CATGAAAGGC TGTGAGAG6A TAAAGATTAA AGCTCTGATT 2460 

CCAAAGAAGG CAGGCGTCAG TGACTGCACA GCCACAGCTT ACCCCAAGTT CACCGA6AGG 2520 

GCTGTCGTAG ACGTGCCGAT GCCCAAGAAG CTCTTTCGTT CTCAGCTGAA AACAAAGGAC 2580 

CATTTCTTGG AGGTCyVAGAT GGAGAGTTCC AAGCAGCACT TCTTCCACCT CTGGAACGAC 2640 

rrcCCTTACA TTGAAGTGGA TGGGAAGAAG TACCCCAGTT OGGAGGATGG CATCCAGGTG 2700 

GTGGTGATTG AOGGGAACCA AGGGOGCGTG GTGAGOCACA 06AGCTTCAG GAACTCCATT 2760 

CTGCAAGGCA TACCATGGCA GCTTTTCAAC TATGTGGCGA CCATCCCTGA CAATTCCATA 2820 

GTGCTTATGG CATCAAAGGG AAGATACGTC TCCAGAGGCC CA1GGACCAG AGTGCTGGAA 2880 

AAGCTTGGGG CAGACAGGGG TCTCAAGTTG AAAGAGCAAA TGGCATTCX5T TGGCTTCAAA 2940 

GGCAGCTTOC GGCCCATCTG GGTGACACTG GACACTGAGG ATCACAAAGC CAAAATCTTC 3000 

CAAGTTGTGC CCATCOCTGT GGTGAAGAAG AAGAAiGTTGT GAGGACAGCT GCCGCCCX3GT 3060 

GCCACCTOGT GGTAGACTAT GACGGTGACT CTTGGCAGCA GACCAGTGGG GGATGGCTGG 3120 

GTCCCCCAGC CCCTGCCAGC AGCTGOCTGG GAAGGOOSTG TTTCAGCCCT GATGGGCCAA 3180 

GGGAAGGCTA TCAGAGACCC TGGTGCTGCC ACCTGCCOCT ACTCAAGTOT CTACCTGOAG 3240 

CCCCTGGGGC GGTGCTGGCC AATGCTGGAA ACATTCACTT TCCTGCAGCC TCTTGGGTGC 3300 

TTCTCTCCTA TCTGTGCCTC TTCAGTGGGG GTTTGGGGAC CATATCAGGA GACCTGGGTT 3360 

GTGCTGACAG CAAAGATCCA CTTTGGCAGG AGCCCTGACC CAGCTAGGAG GTAGTCTGGA 3420 

GGGCTGGTCA TTCACAGATC CCCATGGTCT TCAGCAGACA AGTGAGGGTG GTAAATGTAG 3480 

GAGAAAGAGC CTTGGCCTTA AGGAAATCTT TACTCCTGTA AGCAAGAGCC AACCTCACAG 3540 

GATTAGGAGC TGGGGTAGAA CTGGCTATCC TTGGGGAAGA GGCAAGCCCT GCCTCTGGCC 3600 

GTGTCCACCT TTCAGGAGAC TTTGAGTGGC AGGTTTGGAC TTGGACTAGA TGACTCTCAA 3660 

AGGCCCTTTT AGTTCTGAGA TTCCAGAAAT CTGCTGCATT TCACATGGTA CCTGGAACCC 3720 

AACAGTTCAT GGATATCCAC TGATATiXAT GATCCTGGGT GCCCCAGCGC ACACGGGATG 3780 

GAGAGGTGAG AACTAATGCC TAGCTTGAGG GGTCTGCAGT CCAGTAGGGC AGGCAGTCAG 3840 

GTCCATGTGC ACTGCAATGC CAGGTGGAGA AATCACAGAG AGGTAAAATG GAGGCCAGTG 3900 

CCATTTCAGA GGGGAGGCTC AGGAAGGCTT CTTGCTTACA GGAATGAAGG CTGGG6GCAT 3960 

TTTGCTGGGG GGAGATGAGG CAGCCTCTGG AATGGCTCAG GGATTCAGCC CTCCCTGCOG 4020 

CTGCCTGCTG AAGCTGGTGA CTACGGGGTC GCCCTTTGCT CACGTCTCTC TGGCCCACTC 4080 

ATGATGGAGA AGTGTCGTCA GAGGGGAGCA ATGGGCTTTG CTGCTTATGA GCACAGAGGA 4140 

ATTCAGTCCC CAGGCAGCCC TGCCTCTGAC TCCAAGAGGG TCAAGTCCAC AGAAGTGAGC 4200 

TCCXGCCTTA GGGOCTCATT TGCTCTTCAT CXSUjGGAACT GAGCACAGGG GGCCTCCAGG 4260 

AGACCCTAGA TGTGCTCGTA CTCCCTCGGC CTGGGATTTC AGAGCTGGAA ATATAGAAAA 4320 

TATCTAGCCC AAAGCCTTCA TTTTAACAGA TGGGGAAAGT GAGCCCCCAA GATGGGAAAG 4380 

AACCACACAG CTAAGGGAGG GCCTGGGGAG CCCCACCCTA GCCCTTGCTG CCACACCACA 4440 

TTGCCTCAAC AACCGGCCCC AGAGTGCCXZA GGCACTCCTG AGGTAGCTTC TGGAAATGGG 4500 

GACAAGTCCC CTCQAAGGAA AGGAAATGAC TAGAGTAGAA TGACAGCTAG CAGATCTCTT 4560 

CCCTCCTGCT CCCAGCGCAC ACAAACCCGC CCTCCCCTTG GTGTTGGCGG TCCCTGTGGC 4620 

CTTCACTTTG TTCACTACCT GTCAGCCCAG CCTGGGTGCA CAGTAGCTGC AACTCCCCAT 4680 

TGGTCCTACC TGGCTCTCCT GTCTCTGCAG CTCTACABGT GAGGCCCAGC AGAGGGAGTA 4740 

GGGCTOGCCA TGTTTCTGGT GAGCCAATTT GGCTGATCTT GGGTX5TCTGA ACAGCTATTG 4800 

GGTCCACCCC AGTCCCTTTC AGCTGCTGCT TAATGCCCTG CTCTCTCCCT GGCCCACCTT 4860 

ATAGAGAGCC CAAAGAGCTC CTGTAAGAGG GAGAACTCTA TCTGTGGTTT ATAATCTTGC 4920 

AGGAGGCACC AGAGTCTCCC TGGGTCTTGT GATGAACTAC ATTTATCCCC TTTCCTGCCC 4980 

CAACCACAAA CTCTTTCCTT CAAAGAGGGC CTGGCTGGCr CCCTCCACCC AACTGCACCC 5040 

AT6AGACTOG GTCCAAGAGT OCATTCCOCA GGTGGGAGCC AACTGTCAGG GAGGTCTTTC 5100 

CX:ACCAAACA TCTTTCAGCT GCTGGGAGGT GACCATAGGG CTCTGCTTTT AAAGATATGG 5160 

CTGCTTCAAA GGCCAGAGTC ACAGGAAGGA CTTCTTCCAG GGAGATTAGT GGTGATGGAG 5220 

AG6AGAGTTA AAATGACCTC ATGTCCTTCT TGTCCACGGT TTTGTTGAGT TTTCACTCTT 5280 

CTAATGCAAG GGTCTCACAC TGTGAACCAC TTAGGATGTG ATCACTTTCA GGTGGCCAGG 5340 

AATGTTGAAT GTCTTTGGCT CAGTTCATTT AAAAAAGATA TCTATTTGAA AGriCTCAGA 5400 

GTTGTACATA TGTTTCACAO TACAGGATCT GTACATAAAA GTTTCTTTCC TAAACCATTC 5460 

AOCAAGAGCC AATATCTAGG CATTTTCTTG GTAGC31CAAA TTTTCTTATT GCTTAGAAAA 5520 

TTGTCCTCCT TGTTATTTCT GTTTGTAAGA CTTAAGTGAG TTAGGTCTTT AA0GAAA6CA 5580 

ACGCTCCTCT GAAATGCTTG TCTTTTrTCT GTTGCOGAAA TAGCTGC3TCC TTTTTCGGGA 5640 

GTTAGAIGTA TAGAGTGTTT GTATGiaUlAC ArrTCTTGTA GGCATCACCA TGAACAAAGA 5700 

TATWrrrrCT ATTTATTTAT TATATGTGCA CTTCAAGAAG TCMTOTCAG AGAAATAAAG 5760 
AATTCTCTTA AATGTCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



Seq ID NO: 451 Protein sequence 
Protein Accession #: XP_051860.2 

1 11 21 31 41 51 

iLgVNLSTEV VYraGQDYRP ACYDRCERACR SYRVRPLCGK PVRPKLTVTI OTOVKSTIIil 60 

LEDNVQSWKP GDTLVIASTD YSMYQAEEFQ VLPCRSCAPN QVKVAGKPMY LHIGEBIDGV 120 

DMRAEVGLLS HNIIVMGEME DKCYPYRNHI CNFFDFDTFG (KIKFAI/SFK AAHLEGTELK 180 

HMQQQLVGQY PIHFHLAGDV DERGGYDPPT VIRDLSIHHT PSRCVTVHGS NGIilKDWG 240 

YM^aiCPFT EDGPEBRNTF DHCLGIJiVKS CTLLPSDRDS KKCKMITGDS YPGYIPKPaQ 300 

DCMAVSTFWM ANPKNNLINC AAAGSEETGP WFIFHHVPTG PSVCMYSPGY SEHIPLGKFY 360 

NNRAHSNYRA (MIIDNGVKT TEASAKDKRP FLSIISARYS PHQDADPLKP RBPAIIRHPI 420 

AYKNQDHGAW LRGGDVWLDS CRFADNGIGL TLASGGTFPY DDGSKQBIKM SLPVGBSGHV 480 

GTEMMDNRIH GPGGU3HSGR TLPIGQNFPI RGIQLYDGPI KIQNCTFRKF VAI.EGRHTSA 540 

LAPRUWAWQ SCPHNNVTCI AFEDVPITSR VFFGEPGFrfF NQLDMDCTKT SVFHDVDGSV 600 

SBYPGSYLTK NDMWLVRHPD CINVPDWRGA ICSGCYAQMY IQAYKTSNLR MKIIKNDFPS 660 

HPLYLEGAI.T RSTHYQQYQP WTLQKGYTI HWDQTAPAEL AIWLINFMKG DWIRVGLCYP 720 

RGTTPSILSD VHNRLLKQTS KTGVFVRTLQ MDKVEQSYPG RSHYYWDEDS GLIiPUCLKAQ 780 

NEREKFAFCS MKGCERIKIK ALIPKNAGVS DCTATAYPKF TERAW0VPM PKKLPGSQUC 840 

TKDHFLEVKM ESSKQHPFHL WNDFAYIBVD GKKYPSSEDG IQVWIEKaJQ GRWSHTSFR 900 

NSIUJGIPKQ LFNYVATIPD NSIVLMASKG RYVSRGPMTR VLOO/SADRG WOiKBQMAFV 960 
GFXGSFRPIH VTUJTEDHKA KIFQWPIPV VKKKKL 

Seq ID NO: 452 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 261..2861 

1 11 21 31 41 51 

GAGCTAGCGC TCAAGCAGAG CCCAGCGCGG TGCTATOGGA CAGAGCXrTGG CGAGCGCAAG 60 

CGGOGGGGGG AGCCAGOGGG GCTGAGCGOS GCCAGG6TCT GAACOCAGAT TTCCCAGACT 120 

AGCTACCACT CCGCTTGCCC ACGCCCOGGG AGCTCGCGGC GCCTGGOGGT CAGCGACCAG 180 

ACGTCCGGGG CCGCTGCGCT CCTGGCCCGC GAGGCGTGAC ACTCTCfOGG CTACA6ACCC 240 

AGAGGGAGCA CACTGCCAGG ATGGGAGCTG CTGGGAGGCA GGACTTCCTC TTCAAGGCCA 300 

TGCTCACCAT CAGCTGGCTC ACTCT6ACCT GCTTCCCTGG GGCCACATCC ACAGTGGCTG 360 

CTGGGXGCCC TGACCAGAGC CCTGAGTTGC AACCCTGGAA CCCTGGCCAT GACCAAGACC 420 

ACCATGTGCA TATGGGCCAG GGCAAGACAC TGCIGCTCAC CTCTTCTGCC ACGGTCTATT 480 

CCATCCACAT CTCAGAGGGA GGCAAGCTGG TCATTAAAGA CCAOGAOGAG CCGATTGTTT 540 

TGCGAACCCG GCACATCCTG ATTGACAACG GAGGAGRGCT GCATGCTQGG AGTGCCCTCT 600 

GCCCTITCCA GGGCAATTTC ACCATCATTT TGTATGGAAG GGCTGATGAA GGTATTCAGC 660 

CGGATCCTTA CTATGGTCTG AAGTACATTG GGGTTGGTAA AGGAGGOGCT CTTGAGTTGC 720 

ATGGACAGAA AAAGCTCTCC TGGACATTTC TGAACAAGAC CCTTCACCCA GGTGGCATGG 780 

CAGAAGGAGG CTATTTTTTT GAAAGGAGCT GGGGCCACCG TGGAGTTATT GTTCATGTCA 840 

TCGACCCCAA ATCAGGCACA GTCATCCATT CTGACCGGTT TCACROCTAT AGATCCAAGA 900 

AAGAGAGTGA ACGTCTGGTC CAGTATTT6A AOGCGGTGCC CCATG6CAGG ATCCTTTCTG 960 

TTGCAGTGAA TGATGAAGGT TOTCGfiAATC TGGATGACAT GGCCAGGAAG GCGATGACCA 1020 

AATTGGGAAG CAAACACTTC CTGCACCTTG GATTTAGACA CCCTTGGAGT TTTCTAACTG 1080 

TGAAAGGAAA TCCATCATCT TCAGTGGAAG ACCATATTGA ATATCATGGA CATCGAGGCT 1140 

CTGCTGCTGC CCGGGTATTC AAATTGTTCC AGACAGAGCA TGGCGAATAT TTCAATGTTT 1200 

CTTTGTCCAG TGAGTGGGTT CAAGAOGTGG AGTGGACGGA GTGGTTOGAT CATGATAAAG 1260 

TATCTCAGAC TAAAGGTGGG GAGAAAATTT CAGACCTCTG GAAAGCTCAC CCAGGAAAAA 1320 

TATGCAATCG TCCCATTGAT ATACAGGCCA CTACAATGGA TGGAGTTAAC CTCAGCACCG 1380 

AGGTTGTCTA CAAAAAAGGC CAGGATTATA GGTTTGCTTG CTACGACCGG GGCAGAGCCT 1440 

GCC3GGAGCTA OOGTGTAGGG TTCCTCTOTG QGAAGCCTGT GAGGCCCAAA CTCACAGTCA 1500 

CCATTGACAC CAATGTGAAC AGCACCATTC TGAACTTGGA GGATAATGTA CAGTCATGGA 1560 

AACCTCGAGA TACCCTGGTC ATTGCCAGTA CTGATTACTC CATGTACCAG GCAGAAGAGT 1620 

TCCAGGTGCT TCCCTGCAGA TCCTGCGCCC CCAACCAGGT CAAAGTGGCA GGGAAACCAA 1680 

TGTACCroCA CATCGGGGAG GAGATAGACG GCGTGQACAT GCGGGCGGAG GTTGGGCTTC 1740 

TGAGCCGGAA CATCATAGTG ATGGGGGAGA TGGAGGACAA ATGCTACCCC TACAGAAACC 1800 

ACATCTGCAA TTTCTTTGAC TTCGATACCT TTGGGGGCCA CATCAAGTTT GCTCTGGGAT 1860 

TTAAGGCAGC ACACTTGGAG GGCACGGAGC TGAAGCATAT GGGACAGCAG CTGGTGGGTC 1920 

AGTAOCOGAT TCACTTCCAC CTGGCOGGTG ATGTAGACGA AAGGGGAGGT TATGACCCAC 1980 

CCACATACAT CAGGGACCTC TCCATCCATC ATACATTCTC TOQCTGCGTC ACAGTCCATG 2040 

GCTCCAATOG CTTGTTGATC AAGGACGTTG TGGGCTATAA CTCTTTGGGC CA CTGCTT CT 2100 

TCAGGGAAGA TCGGCCXXJAG GAACGCAACA CTTTTGACCA CTGTCTTGGC CTCCTTGTCA 2160 

AGTCTGGAAC CCTCCTCCCC TCGGACCGTG ACAGCAAGAT GTGCAAGATG ATCACAGAGC 2220 

ACTCCTACCC AGGGTACATC CCCAAGCCCA GGCAAGACTG CAATGCTGTG TCXIACCTTCT 2280 

GGATCGCCAA TCCCAACAAC AACCTCATCA ACTGTGCCGC TGCAGGATCT GAGGAAACTG 2340 

GATTTTGGTT TATTTTTCAC CACGTACCAA CGGGCCCCTC C3GTQGGAATG TACTCCCCAG 2400 

GTTATTCAGA GCACATTOCA CTGGGAAAAT TCTATAACAA OCGAGCACAT TCCAACTACC 2460 

GGGCTGGCAT GATCATAGAC AAOGGAGTCA AAACCACOGA GGCCTCTGCC AAGGACAAGC 2520 

GGCOGTTCCT CTCAATCATC TCTGCCAGAT ACAGCCCTCA CCAGGAOGCC GACCCGCTGA 2580 

AGCCCCGGGA GCCGGCCATC ATCAGACACT TCATTGCCTA CAAGAACCAG GACCACGGGG 2640 

CCTGGCrGCG OSGCGGGGAT GTGTGGCrGG ACACCTGCCA TTTCAGAGGG GAGGCtCAGG 2700 

AAGGCTTCTT GCTTACAGGA ATGAAGGCTG GGGGCATTTT GCTGGGGGGA GATGAGGCAG 2760 

CCTCTGGAAT GGCTCAGQGA TTCAGCCCTC CCTGCCGCTG CCTGCTGAAG CTGGTGACTA 2820 

CGGGGTCGCC CTTTGCTCAC GTCTCTCTGG CCCACTCATG ATGGAGAAGT GTGGTCAGAG 2880 

GGGAGCAATG GGCTTTGCTG CTTATGAGCA CAGAGGAATT CACTCCCCAG GCAGCCCT6C 2940 

CTCTGACTCC AAGAQGGT6A AGTCCACAGA AGTGAGCTCC TGQCTTAGGG CCTCATTTGC 3000 

TCTTCATCCA GGGAACIGAG CACAGGGGGC CTCCAGGAGA OCCTAGATGT GCTOGTACTC 3060 

CCTOGGCCTG GGATTTCAGA GCTGGAAATA TAGAAAATAT CTAGOCCAAA GCCTTCATTT 3120 
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TAACAGATGG GGAAAGlGAG COCGCAAGAT GGGAAAiSAAC 
TCSGGGAGCCC CACCCTAGCC CTTGCTGCCA CACCACATTG 
GTGCCCAGGC ACTCCTGAGG TAGCTTCTGG AAATGGGGAC 
AAATGACTAG AGTAGAATGA CAGCTAGCAG ATCTCTTCOC 
AACCOGCCCT CCCCTTGGTG TTGGCGGTCX: CTCXCGCCTT 
AGGGCAGOCT GGGTGCACAjG tAGCTGCAAC TCCO^TTOG 
TCTCCAGCTC TACAGGTGAG GOCCACCACA OGGAGTAGG6 
CCAATTTGGC TGATCTTGGG TGTCTGAACA GCTATTGGGT 
TGCTGCTTAA TGCCCTGCTC TCTCCCTGGC CC ACCTT ATA 
TAAGAGGGAG AACTCTATCT tflXayiT TATA ATCTTGCACG 
GTCnGTGAT GAACTACATT TATCCCCTTT CCTGCCCCAA 
AGAQGGOCTG CCTGGCTCCC TCCACCCAAC TG CACCC ATG 
TTCCCCAGGT GGGAGCCAAC TGTCAGGGAG GTCTTTCCCA 
GGGAGGTGAC CATAGGGCTC TGCTTTTAAA GATATGGCTG 
GGAAGGACTT CTTCCAGGGA GATTAGTGGT GATGGAGAGG 
TCCTTCTTGT CCAOGGTTTT GTTGAGTTTr CACTCTTCTA 
GAACCACTTA GGATGTGATC ACTTTCAGGT GGCCAGGAAT 
TTCATTTAAA AAAGATATCT ATTTGAAAGT TCTCAGACTT 
AQGATCTGTA CATAAAAGTT TCTTTCCTAA ACCATTCACC 
TTTCTTGGTA GCACAAATTT TCTTATTGCT TAGAAAATTG 
TGTAAGACTT AAGTGAGTTA GGTCTTTAAG GAAAGCAACG 
TTrrrCTGTT GCCGAAATAG CTGGTCCTTT TTCGGGAGTT 
TGTAAACATT TCTTGTAGGC ATCACCATGA ACAAAGATAT 
ATGTGCACTT CAAGAAGTCA CTGTCAGAGA AATAAAGAAT 
GAGATGTCCT TTGCATTGCT TGGAAGGGGT GTACCTAGAG 
TTGGAAAAAT TTTGCTGTTA TTATAGTAAA CATACAAAGG 
AAAAAAAAAA AAAAAAAAAA AA 

Seq ID KO: 453 Protein sequence 
Protein Accession 8; Eos sequence 



CAGACAGCTA 
CCrCAACAAC 

AAGTCCCCTC 
TCCTGCTCCC 
CACTTTGTTC 
TGCTACCTGG 
CTOGOCATGT 
CCACCCCAGT 
GAGAGCCCAA 
AGGCACCAGA 
CCACAAACTC 
ACACTOGGTC 
CCAAACATCT 
CTTCAAAGGC 
AGAGTTAAAA 
ATGCAAGGGT 
GTTGAATGTC 
GTACATATGT 
AAGAGCCAAT 
TCCTCCTTGT 
CTCCTCTGAA 
AGATGTATAG 
ATTTTCTATT 
TGTCTTAAAT 
CCAAGGAAAT 
ATGTCAAAAA 



AGGGAGG6CC 
CGGCCCCAGA 
GAAGGAAAGG 
AGOGCACACA 
ACTACCTGTC 
CTCTCCTGTC 
TTCTGGTGAG 
CCCTTTCAGC 
AGAGCTOCTG 
GTCTCCCTGG 
TTTCCTTCAA 
CAAGAGTCCA 
TTCAGCTGCT 
CAGAGTCACA 
TGACCTCATG 
CTCACACTGT 
TTTGGCTCAG 
TTCACAGTAC 
ATCTAGGCAT 
TAT TTCTGT T 
ATGCTTGTCT 
AGTGTTTGTA 
TATTTATTAT 
GTCATGATTG 
TGGCTCTGGT 
AAAAAAAAAA 



aieo 

3240 

3300 
3360 
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3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 



PCTAJS02/12476 



1 
I 

MGAAGRQDPL 
GKTLUiTSSA 
TIILYG5ADE 



SRNLOOMARK 
KLFQTEHGEY 
IQATTMDGVN 
STILNLEDNV 
BIDGVDMRAE 
GTELKHMGQQ 
KDWGYNSLG 
PKPRQDC37AV 
liGKFYIWRAH 
IRHFIAYKNQ 
PSPPCRCLLK 



11 
1 

FKAMLTISWL 
TVYSIKISBG 
GIQPOPYYGL 
VHVIDPKSGT 
AMTKLGSKKF 
FNVSLSSEWV 
LSTEWYKKG 
QSWKPGDTLV 
VGLLSRNIIV 
LVGQYPIHFH 
HCFFTEDGPE 
STFWMANmN 
SNYRAGMIID 
DHGAWLRGGD 
LVTTGSPPAH 



21 
I 

TLTCFPGATS 
GKLVIKDHDE 
KYIGVGKGGA 
VIHSDRFDTY 
LHLGFRHPWS 
QDVEWTEWFD 
QDYRFACYDR 
lASTDYSMYQ 
KGEMEDKCYP 
LAGDVDERGG 
ESNTFDHCLG 
NLINCAAAGS 
NGVKTTEASA 
VUL0SCHFR6 
VSLAHS 



31 
I 

TVAAGCPDQS 
PIVLRTRHIL 
LELKGQKKLS 



FLTVKGHPSS 
HDKVSQTXGG 
GRACRSYRVR 
AEEFQVLPCR 
YRNHICNPFD 
YDPPTYIRDL 
LLVKSGTLLP 
BETGFHFIFH 
KDKRPFLSII 
EAQSGFLIiTG 



41 51

| |

PELQPWNPGH DQDHHVHIGQ 60

IDKGGELHAG SALCPFQGNF 120

WTFLNKTLHP GGMAEGGYFF 180

QYLNAVPDGR ILSVAVNDEG 240

SVEDHIEYHG HRGSAAARVF 300

EKISDLVnCAH PGKiaiRPID 360

FLCGKPVRPK LTVTIDnNVN 420

SCAPMQVKVA GKPMYLHIGE 480

FDTFGGHIKF ALGPKAAHLB 540

SIHHTFSRCV TVHGSNGLLI 600

5DR0SKMCKM ITEDSYPGYI 660

BVPTGPSVGH YSPGYSailP 720

SARYSPHQDA DPIaKPREPAI 780

MKAGGILLGG OEAAS6MAQG 840



Seq ID NO: 454 DNA sequence

Nucleic Acid Accession #: NM_013282.2

Coding sequence: 85..2466

1 11 21 31 41 51

| | | | | |

OGACTCCTTA GAGCATGGCA TGGCTCAGAG GTGCTGGTAA AACTGATGGG GGTTTTTGCr 60

GTCCCTCCCC TCAGCGCCGA CACCATGTGG ATCCAGGTTC GGACCATGGA C6GGAGGCAG 120

ACCCACA06G TGGACTCGCT GTCCAGGCTG ACCAAGGTGG AGGAGCTGAG 6CX3GAAGATC 180

CAGGAGCTGT TCCACGTQGA GCCAGGCCTG CAGAGGCTGT TCTACAGGGG CAAAChGATG 240

GAGGACGGCC ATACCCTCTT C6ACTACGAG GTCCGCCTGA ATGACACCAT CCAGCTCCTG 300

GTCCGCCAGA GCCTCX3TGCT CCCCCACAGC ACCAAGGAGC GGGACTCCGA GCTCTCCGAC 360

ACCGACTCCG GCTGCTGCCT GGGCCA6AGT GAGTCAGACA AGTCCTCCAC CCACGGOSAG 420

GCGGCCGCCG AGACTGACAG CAG6CCAGCC GATGAGGACA TGTGGGATGA GACGGAATTG 480

GGGCTGTACA AOGTCAATGA GTACGTOGAT GCTOGGGACA CGAACATGGG GGOGTGGTTT 540

GAGGCGCAGG TGGTCAGGGT GACGCGGAAG GCCCCCTCCC GGGACGAGGC CTGCAGCTCC 600

ACXSTCCAGGC CGGCXSCTGGA GGAGGACGTC ATTTACCAOG TGAAATAGGA OOACTAOCCG 660

GAGAACGGCG TGGTCCAGAT GAACTCCAGG GACGTCOGAG CGCGCGCC06 CACCATCATC 720

AAGTGGCAGG ACCTGGAGGT GGGCCSIGGTG GTCATGCTCA ACTACAACCC CGACAACCCC 780

AAGGAGCGGG GCTTCTGGTA CGACGCGGAG ATCTCCAGGA AGCGCGAGAC CAGGACGGCX3 840

CGGGAACTCT ACGCCAAGGT GGTGCTGGGG GATGATTCTC TGAACGACTG TCGGATCATC 900

TTCGTGGAOG AAGTCTTCAA GATTGAGGGG GOGGGTGAAG GGAGCCCCAT GGTTGACAAC 960

CCCATGAGAC GGAAGAGCGO GCCGTCCTGC AAGCACTGCA AGGAOGACGT GAACAGACrC 1020

TGCCGGGTCT GCGCCTGCCA CCTGTGCGGG GGCCGGCAGG ACCCCGACAA GCA6CTCATG 1080

TGCGATGAGT GCGACATGGC CTTCCACATC TACTGCCTGG ACCOGCCCCT CAGCAGTGTT 1140

CCCAGCGAGG ACGAGTGGTA CTGCCCTGAG TGCOSGAATG ATGCCAGCGA GGTGGTACTG 1200

GCGGGAGAGC GGCTGAGAGA GAGCAAGAAG AAGGCGAAGA TGGCCTCGGC CACATCGTCC 1260

TCACAGGG6G ACTGGGGCAA GGGCATGGCC TGTGTGGGCC GCACCAAGGA ATCTACCATC 1320

GTCCXS3TCCA ACCACTACGG ACCCATCCCG GGGATCCCCG TGGGCACCAT GTGGCGGTTC 1380

CGAGTCCAGG TCAGCGAGTC GGGTGTOCAT GGGCCCCACG TGGCTGGCAT ACAOGGGOGG 1440

AGCAACGAOG GAGOGTACTC CCTAGTCCTG GCGGGGG6CT ATGAGGATGA CGTGGACCAT 1500

GGGAATTTTT TCACATACAC GGGTAGTGGT GGTCGAGATC TTTCOGGCAA CAAGAGGACC 1560

GCGGAACAGT CTTGTGATCA GAAACTCACC AACACCAACA GGGCGCTGGC TCTCAACTGC 1620

TTTGCTCCCA TCAATGACCA AGAAGGGGCC GAGGCCAAGG ACTGGGGGTC GGGGAAGCCG 1680

GTCAGGGTGG TGGGCAATGT CAAGGGTGGC AAGAAIAGCA AGTACGCCXIC CGCTGAGGGC 1740

AACXSCTAOG ATGGCATCTA CAAGGTTGIG AAATACTGGC CCGAGAAGGG GAAGTCCGGG 1800

TTTCTCGTGT 6GC6CTACCT TCT6GGGAGG GACGATGATG AGCCTGGCOC TTGGACGAAG 1860

GAGGGGAAGG ACCGGATCAA GAAGCTGGGG CT6ACCATGC AGTATCCAGA AGGCTACCTG 1920

GAAGCCCTGG CCAACCGAGA GCGAGACaAG GAGAACAGCA AGAGGGAGG^ GGAGGAGCAG 1980 

CAGGAGGGGG GCTTCGOGTC CCCCAGGACG GGCAAGGGCA AGTGGAAGOG GftAGTOGGCA 2040 

GG1AGGTG60C OGAGCAGGGC CX3GGTCCCCG CGCC3GGACAT CCAAGAAAAC CAAGGTGGAG 2100 

CCCTACAGTC TCAOGGCCTA GCAGAGCAGC CTCATCAGAG AGGACAAGAG CA AOGC CAAG 2160 

CTCTGGAATG AGGTCCTGGC GTCACTC3iAG GAOOQGCCGG OGACOGGCAG C COGTTCCftG 2220 

TTGTTCCTGA GTAAAGTGGA GGAGAOGTTC CAGTGTATCT GCTGTCA06A GCTGGTGTTC 2280 

CGGCCCATCA CGACCGTGTG CCAGCACAAC G TGTG C AAGG ACTCCCTGGA CAGATCCTTT 2340 

CGGGCACA06 TCTTCAGCTG CCCTGCCTGC CGCTAOGACC TGGGCOGCAG CTATGCCATG 2400 

CAGGtGAACC AGCCTCTGCA GACOGTCCTC AACCAGCTCT TCCCCGGCTA C3GGCAATGGC 2460 

OGGTGATCTC CAAGCACTTC TCGACAGGCG nTTGCTGAA AAOTrGTCGG AGGGCTCGTT 2520 

CATGGGCACr G AimCTf C TTAGTGGGCT TAACTTAAAC AGGTAGTGrT TCC TCCG TTC 2580 

OCTAAAAAGG ri TO C r rCC TTTTTTTTTA TTTTTATTTT TCAAATC TAT A CATTT TCAG 2640 

GAATTTATGT ATTCTGGCTA AAAGTTGGAC TTCTCAGTAT TGTGTTTAGT TCTTTGAAAA 2700 

CATAAAAGCX: ■reCAATTTCT OGACAAAACA ACACAAGATT TTTTAAAGAT GGAATCAGAA 2760 

ACTACGTGGT GTGGAGGCTG TTGATGTTTC TGGTCTCAAG TTCTCAGAAG TTCCTCCCAC 2820 

CAACTCTTTA AGAAGGOGAC AGGATCAGTC CTTCTCTAGG GTTCTGGCCC CCAAGGTCAG 2880 

AGCAAGCATC TTCCTGACAG CATTTTGrCA TCTAAAGTCC AGTCACATGG TTCCCCGTGG 2940 

TGGCCCGTCG CAGCCCGTCG CATGGC3GTGG CTCAfiCTGTC TGTTGAAGTT GTTGCAAGGA 3000 

AAAGAGGAAA CATCTCGGGC CTAGTTCAAA CCTTTGOCTC AAAGCCATCC CXXACCAGAC 3060 

TGCTTAGOST CTGAGATCCG CGTGAAAAGT CCTCTGCCCA OGAGAGCAGG GAGTTQGGGC 3120 

CAOGCAGAAA TCGCCTCAAG GGGACTCTGC TCCACGTGGG GCCAGGCX3TG TGACTGACGC 3180 

TGTCCGACX;A AGGCGGCCAC GGACGGACGC CAGCACACGA AGTCACX3TCC AAGTGCCTTT 3240 

GATTCGTTCC TTCTTTCTAA AGACX3ACAGT CmX i TlX^n' AGCACTGAAT TATTGAAAAT 3300 

GTCAACCRGA TTCTAGAAAC TGOGGTCATC CAGTTCTTOC TGACACOGGA TGGGTGCTTG 3360 

GGAACCGTTT GAGCCTTATA GATCATTTAC ATTCAATTTT TTTAACTCAG CAAGTGAGAA 3420 

CTTACAAGAG GGTTTTTTTT TAATTTTTTT TTCTCTTAAT GAACACATTT TCTAAATGAA 3480 

TTTTTTTTGT AGTTACTGTA TATGTACCAA GAAAGATATA AOGTTAGGGT TTGGTTGTTT 3540 

TTGTTTTTGT ATTTTTTTTC TTTTGAAAGG GTTTGTTAAT TTTTCTAATT TTACCAAAGT 3600 

TTGCAGCCTA TACCTCAATA AAACAGGGAT ATTTTAAATC ACATACCTGC AGACAAACTG 3660 

GACCAATGTT ATTTTTAAAG GGTTTTTTTC ACCTCCTTAT TCTTAGATTA TTAATGTATT 3720 

AGGGAAGAAT GACACAATTT TGTGTAGGCT TTTTCTAAAG TCCAGTACTT TGTCCAGATT 3780 
TTAQATTCTC A6AATAAATS TTTTTCACAO ATTGAAAAAA AAAAAAAA 






Seq ID NO: 455 Protein sequence 
Protein Accession #: NP_037414.2

1 11 21 31 41 51

| | | | | |

MWIQVRTMDG RQTHTVDSLS RLTKVEELRR KIQELPHVEP GLQRLFYRGK QMEDGHTLFD 60

YEVRLKDTIQ LLVRQSLVLP HSTKERDSEL SUTDSGCCLG QSESDKSSTH GBAAAETDSR 120

PADEDMWDET ELGLYKVNEY VDAROTNMGA WFEAQWRVT RKAPSRDEPC SSTSRPALBE 180

DVZYHVKYDZ} YPEIKSWQMM SRDVRARART riKWQDLEVG QWMLNYNPD NPKERGFWYD 240

AEISRKRETR TARELYANW LGDDSLNDCR IIFVDEVFKI ERPGEGSPMV DNPMRRKSGP 300

SCKHCKDDVN RLCRVCACHL OGGRQDPDKQ LMCOECDMAF HIYCU)PPLS SVPSEDEWYC 360

PECRNDASEV VLAGERLRES KKKAKMASAT SSSQRDWGXG MACVGSTKEC TZVPSNHYGP 420

IPGIPVGTMW RFRVQVSESG VHRPHVAGIH GRSNDGAYSL VLAGGYEDDV DHGNPFTYTG 480

SGGRDLSGNK RTAEQSCDQK IiTNTNRALAL NCFAPHTOQE GAEAKDWRSG KPVRWHNVK 540

GGKNSKYAPA EGNRYDGIYK WKYWPEKGK SGFLVWRYIiL RRDDDEPGPW TKEGKDRIKK 600

LGLTMQYPEG YLEALANRER EKEKSXREEB EQQBGGFASP RTGKGKWKRK SAGGGPSRAG 660

SPRRTSKKTK VEPYSLTAQQ S5LIREDKSN AKIiNHEVLAS LKORPASGSP FQXjFLSEVEE 720

TFQCICC3QEL VFRPITTVOQ HNVCKDCLDR SFRAQVPSCP ACRYDLGRSY AMQVNQPLQT 



Seq ID NO: 456 DNA sequence 
Nucleic Acid Accession #: NM_001200.1 
Coding sequence: 325..1514 

1 11 21 31 41 51 

GGGGACTTCr TGAACTTGCA GGGAGAATAA CTTGCXSCACC CCACTTTGOS CCGGTGCCTT 60 

TGCCCCAGCG GAGCCTGCTT CGCCATCTCC GAGCCCCACC GCCCCTCCAC TCCTCGGCCT 120 

TGCCCGACAC TGAGACGCTG TTCCCAGOGT GAAAAGAGAG ACTGOGCGGC CGGCACCCGG 180 

GAGAAGGAGG AGGCAAAGAA AAGGAACGGA CATTCGGTCC TTGCGCCAGG TCCTTTGACC 240 

AGAGTTTTTC CATCTGGAOG CTCTTTCAAT GGACGTGTCC CCGCGTGCTT CTTAGACGGA 300 

CTGCGGTCTC CTAAAG6TCG ACCATGGTGG C06GGACCCG CTGTCTTCTA GCGTTGCTGC 360 

TTCCCCAGGT CCTCXTTCGGC GGOSCGGCTG GCCTCGTTCC GGAGCTGGGC CGCAGGAAGT 420 

TCGCGGCGGC GTCGTOGGGC CGCCCCTCAT CXTCAGCCCTC TGACGAGGTC CTGAGOGAGT 480 

TCGAGTTGCG GCTGCTCAGC ATGTTCaSGCC TGAAACAGAG ACCCACCCCC AGCAGGGAC6 540 

CCGTGGTGCC CCCCTACATG CTAGACCTGT ATOGCAGGCA CTCAGGTCAG CCGGGCTCAC 600 

CCGCCCCAGA CCACCGGTTG GAGAGGGCA6 CCAGCCGAGC CAACACTGTG CGCAGCTTCC 660 

ACCATGAAGA ATCTTTGGAA GAACTACCAG AAACGAGTGG GAAAACAACC CGGAGATTCT 720 

TCTTTAATTT AAGTTCTATC CCCACGGAGG AGTTTATCAC CTCA6CAGAG CTTCAGGTTT 780 

TCOSAGAACA GATGCAAGAT GCTTTAGGAA ACAATAGCAG TTTCCATCAC CGAATTAATA 840 

TTTATGAAAT CATAAAACCT GCAACAGCCA ACTCGAAATT CCCCGTGACC AGACTTTTGG 900 

ACACCAGGTT GGTGAATCAG AATGCAAGCA GGTGGGAAAG TTTTGATGTC ACCCCCGCTG 960 

TGATGCGGTG GACTGCACAG GGACACGCCA ACCATGGATT OGTGGTGGAA GTGGCCCACT 1020 

TGGAGGAGAA ACAAG6TGTC TGCAAGAGAC ATGTTAGGAT AAGCAGGTCT TTGCACCAAG 1080 

ATGAACACAG CTGGTCACAG ATAAGGCCAT TGCTAGTAAC TTTTG6CCAT GATGGAAAAG 1140 

GGCATCCTCT CCACAAAAGA GAAAAACGTC AAGCCAAACA CAAACAGCGG AAACGCCTTA 1200 

AGTCCAGCTG TAAGAGACAC CCTTTGTACG TGGACTTCAG TGACGTGGGG TGGAATGACT 1260 

GGATTGTGGC TCCOCCGGGG TATCACGCCT TTTACTGCCA CGGAGAATGC CCTTTTCCTC 1320 

TGGCTGATCA TCTGAACTCC ACTAATCATG CCATTGTTCA GACGTTGGTC AACTCT6TTA 1380 

ACTCTAAGAT TCCTAAGGCA TGCTGT6TCC OGACAGAACT CAGTGCTATC TCGATGCTGT 1440 

ACCTTGACGA GAATGAAAAG GTTCTATTAA AGAACTATCA GGACATGGTT GTGGAGGGTT 1500 
GTGGGTGTOG CTAOTACAGC AAAATTAAAT ACATAAATAT ATATATA 


Seq ID NO: 457 Protein sequence 
Protein Accession #: NP_001191.1 



1 11 21 31 41 51 

MVAGTRCLIA LLLPQVUiGG AAGLVPSLGH RKPAAASSGR PSSQPSDEVL SEFE LRLT ^ S M 60 

PGLKQRPTPS RDAWPPYML DX#YRRHSGQP GSPAPDHRLB RAASSAilTVR SFHHEESLES 120 

ZJOTSGKTTR RFPFNLSSIP TBEPITSAEL QVPaBQHQDA LCaOlSSFHHH INIYEIIKPA 180 
TANSKFPVTR LUDT 

Seq ID NO: 458 DNA sequence 

Nucleic Acid Accession #: NM_001999.2 

Coding sequence: 1..8736 

1 11 21 31 41 51 

ATGGGGAGAA GACGGAGGCT GTGTCrCCAG CTCTACTTCC TGTGGCTCGG CTGTGTGGTG 60 

CTCTGGGCGC AGGGCACXSGC CGGCCAGCCT CAGCCTCCTC CGCCCAAGCC GCCCO GGOCC 120 

CAGCCGCOGC OGCAACAGGT TOGGTCCGCT ACAGCAGGCT CTGAAGGCGG GTTTCTAGCX; 180 

CCCGW3TATC CGGAGGAGGG TCSCCGCAGTG GCCAGCCGCG TCCGCCGGCG AGGftCAGCAG 240 

GAOGTGCTCC GAGGGCCCAA OGTGTGCGGC TOaGATTOC ACTCCTACTG CT GCCCTG GA 300 

TGGAAGACGC TCCCXGGAGG AAACCAGTGC ATTGTCCOGA TTrCTAGAAA TAGTTGTGGA 360 

GATGGATTTT GTTCCCGTCC TAACATGTGT ACTTGTTCCA GTGGGCAAAT ATCATCAACC 420 

TGTGGATCAA AATCAATTCA GCAGTGCAGT GTGAGATGCA TGAATGGTGG GACCTGTGCA 480 

GATCACCACT GCCAGTCCCA GAAAGGATAT ATTGGAACTT ATTC3TGGACA ACCTGTCTGT 540 

GAAAAIGGAT GTCAGAATGG TGGAOGTTGC ATCGCCCAAC CGTGTGCTTG TGTTTATGGG 600 

TTGACTGGTC CACAGTGTGA AAGAGATTAC AGGACAGGCC CGTGTTTCAC TCA<3GTCAAC 660 

AACCAGATGT GCCAAGGGCA GCTGACAGGC ATTGTCTGCA aSAAGACTCT GTGCTGTGCC 720 

ACCACTGGAC GGGOGTGGGG CCATCCCTGT GAGATGTGTC CAGC CCAGC C TCAGCCCTGC 780 

CGACGGGGTT TCATCCCCAA CATCCGCACT GGAGCTTGCC AAGATGTTGA TGAA TGCCAG 840 

GCTATCCCAG GGATATGCCA AGGAGGAAAC TGTATCAATA CAGTGGGCTC TTTTGAATGC 900 

AGATGCCCTG CTGGTCACAA ACAGAGTGAA ACTACTCAGA AATGTGAAGA CATTGATGAG 960 

TGCAGCATCA TTCCTGGGAT ATGTGAAACT GGTGAATGTT CCAACACOGT GGGAAGCTAT 1020 

T rnvfg m gtccacotgg atatgtaacc tcaacagatg gctctcxsatg catogatcag 1080 

AOAACAGGCA TGTGTTTCTC GGGCCTGGTG AATGGCCGCT GTGCACAAGA GCTCCCGGGG 1140 

AGAATGACGA AAATGCAGTG CTGCTGTGAG CCTGGCCGCT GCTGGGGCAT CX^QAACCATT 1200 

CCTGAAGCCT GTCCTGTCAG AGGTTCTGAG GAATATCGCA GACTTTGCAT GGATGGACTT 1260 

CCRATGGGAG GAATTCCAGG GAGTGCTGGT TCCAGACCTG GAGGCACTGG GGGAAATGGC 1320 

TTTGCCC5CAA GTGGCAATGG CAATGGCTAT GGOCCAGGAG GGACAGGCTT CATCCCCATC 1380 

CCTGGAGGCA ATCGCTTTTC TCCTGGOGTT GGGGGAGCCG GTGTGGGGGC CGGGGGACAG 1440 

GGACCTATCA TCACTGGACT AACAATTCTG AACCAGACAA TAGATATCTG TAAGCATCAT 1500 

GCTAACCTTT GTTTAAATGG ACGCTGTATA <XAACTGTCT CAAGCTACC3G ATGTGAATGC 1560 

AACATGGGTT ATAAGCAGGA TGCAAATGGA GATTGTATAG ATGTTGATGA ATGCACATCA 1620 

AATCCCTGCA CTAATGGAGA TTGTGTTAAC ACACCTGGTT CCTATTATTG TAAATGTCAT 1680 

GCTGGATTCC AGAGGACTCC TACCAACCAA GCATGCATTG ATATTGATGA GTGCATCCAG 1740 

AATCGGGTTC TTTGTAAAAA CGGTCGATGC GTGAACTCiMS ATGGAAGTTT CCAGTGCATT 1800 

TGCAATGCCG GCTTTGAATT AACTACAGAT GGAAAAAACT GTGTTGATCA TGATGAATGT 1860 

ACAACTACCA ACATGTGTTT GAATGGAATG TGCATCAATG AAGATGGCAG CTTCAAGTGC 1920 

ATCTGCAAAC CAGGATTTGT CTTGGCTCCA AATGGGOGTT ACTGTACTGA TGTTGATGAA 1980 

TGCCAGACCC CAGGAATCTG CATGAATGGG CACTGCATCA ACAGTGAAGG GTCCTTCCGC 2040 

TGTGACTGTC CCCCAGGCCT GGCTGTGGGC ATGGATGGAC GTGTGTGTGT TGATACTCAC 2100 

ATtSOGCAGTA CCTGCTATGG AGOAATCAAG AAAGGAGTCT GTGTGCGTCC TTTCCCOGGT 2160 

GCAGTGACCA AGTCCGAATG CTGCTGTGCC AATCCAGACT ATGGTTTTGG AGAACCCTGC 2220 

CAGCCATGCC CTGCAAAAAA TTCAGCTGAA TTCCAOGGCC TTTGTAGTAG TGGAGTAGGT 2280 

ATCACTGTGG ATGGAAGAGA TATCAATGAA TGTGCTTTQG ATCCTGATAT ATGT6CCAAT 2340 

GGGATTTCTG AAAACTTACG TGGTAGTTAC CGTTGTAATT GCAACAGTGG CTATGAACCA 2400 

GATGCCTCTG GAAGAAACTG TATTGACATT GATGAATGTT TAGTAAACAG ACTGCTTTGT 2460 

GATAACGGAT TGTGCCGAAA CACGCCAGGA AGTTACAGCT GTACGTGCCC ACCAGGGTAT 2520 

GTGTTCAGGA CTGAGACAGA GACCTGTGAA GATATAAATG AATGTGAAAG CAACCCATGT 2580 

GTCAATGGGG CCTGCAGAAA CAACCTTGGA TCTTTCAATT GTGAATGTTC GCCCGGCAGC 2640 

AAACTCAGCT CCACAGGATT GATCTGTATT GACAGCCTGA AGGGGACCTG TTGGCTCAAC 2700 

ATCCAGGACA GCCGCTGTGA GGTGAATATT AATGGAGCCA CTCTGAAATC TGAATGCTGT 2760 

GCCACCCTCG GAGCCX5CCTG GGGGAGCCCC TGTGAGCGGT GTGAACTAGA TACAGCTTGC 2820 

CCAAGAGGGC TTGCCAGGAT TAAAGGTGTT ACGTGTGAAG ATGTTAATGA GTGTGAGGTG 2880 

TTCCCTCGOG TTTGTCCAAA TGGAtXXrTGT GTCAACAGTA AGGGATCTTT TCATTGCGAG 2940 

TGCXXrrGAAG GCCTTACGTT GGATGGGACT GGCOGTGTAT GTTTGGATAT TCGCATGGAG 3000 

CAGTGTTACT TGAAGTGGGA TGAAGATGAA TGCATCCACC COGTTCCTGG AAAGTTCCGC 3060 

ATGGATGCCT GCTGCTGTGC TGTCGGGGCG GCTTGGGGCA COQAGTGTGA GGA6TG0C0C 3120 

AAACCTCGCA CCAAGGAATA CGAGACACTG TGCCCCCX30G GGGCTGGCTT TGCTAACOGA 3180 

GGGGATGTTC TTACTGGGCG GCCATTTTAC AAAGACATCA ATGAATGCAA AGCATTTCCT 3240 

GGGATGTGCA CTTATGGGAA GTGCAGAAAT ACAATCGGAA GCTTCAAATG CCGTTGCAAT 3300 

AGTGGCTTTG CTCTAGACAT GGAGGAAAGA AACTGCACGG ACATCGACGA GTGCAGGATT 3360 

TCTCCTGACC TCTCTGGCAG TGGAATCTGC CffCAATACAC CGGGCAGCTT TGAGTGOGAG 3420 

TGCTTOGAAG GCTATGAAAG TGGCTTCATG ATGAT6AAGA ACTGCATGGA CATTGAOGGA 3480 

TGTGAACGTA ACCCTCTCCT TTGTAGGGGT GGCACCTGTG TGAACACTGA GGGCAGCTTT 3540 

CAGTGTGACT GCCCACTGGG ACACGAGCTG TCACCATCCC GTGAGGACTG TGTGGATATT 3600 

AATGAATGCT CCCTGAGTGA CAATCTCTGC AGAAATGGAA AATGTGTGAA CAT GATT GGA 3660 

ACCtATCAGT GCTCTTGCAA TCCTGGATAT CAGGCTACGC CAGACCGCCA GGGCTGTACA 3720 

GATATtGATG AATGTATGAT AATGAAOGGA GGCTGT6ACA CXXAGTGCAC AAATTCAGAG 3780 

GGAAGCTAOC AATGCAGCTG CAGT6AGGGT TATGCCCTGA TGCCAGATGG GAGATCGTGT 3840 

GCAGACATTG ATGAATGTGA AAACAATGCT GATATCTGTG ATGGCGGCCA GT6TA0CAAC 3900 

ATTCCTGGAG AGTATCGCTG CCTCTGCTAT GATGGCTTCA TGGCTTCCAT GGACATGAAA 3960 

ACATGCATTG ATGTCAATGA ATGTGACCTA AATTCAAATA TCTGCATGTT TGGGGAATGT 4020 

GAGAACACAA AGGGATCCTT CATTTGCCAC TGTCAGCTGG GTTACTCAGT GAAGAAGGGG 4080 

ACCACAGGAT GTACAGATGT GGATGAGTGT GAAATTGGTG CTCATAACTG CGACATGCAT 4140 

GCCTCATGTC TGAATATCGC AG6AAGCTTC AAGTGTAGCT GCAGAGAAGG CTGGATTGGA 4200 

AAOGGCATCA AGTGTATTGA TCTGGAGGAA TGTTCTAATG GAACCCACCA G TGTAGC ATC 4260 

AATCCTCAGT GTGTAAATAC CCXX3GGCTCA TACOGCTGTG CCTGCTCCGA AGGTTTCACT 4320 

GGIGATGGCT TTACXTGCTC AGATGTTGAT GAGTGTGCAG AAAACATAAA CCTCTGTGAG 4380 
AACGGACAGT GCCTTAATGT CCCGGGTGCA TATCGCTGOG AGTGTGAGAT GGGCTTOlCT 
CCAGCCTCAG ACAGCAGATC CTGCCAAGAT ATTGAJTCAAT GCTCCTTCCA AA ACATTTG T 
CTCTCTGGAA CATCTAATAA CCTGCCTGGA ATCTTTCATT GCATCTGOGA TCATGGTTAT 
GAATTGGACA GAACAGGAGG GAACTGTACA GATATTGATG AGTGTGCAGA TCCTATAAAC 

GOCTA-rorcr caacaogcct ggtogctatc agtctaactc ccouxxssat 
Stcacttca acccaactcg tctgggttgt gttgacaaoc gtgtgggcaa ctgctacctg 

AAGTTTGGAC CTGGAGGAGA TC3GGAGTCTG TCTTGCAACA CCGAGATCGG GGTGGGCGTC 
AOTCGCTCTT CATC3CTCCTG CTCTCTGGGA AAGGCCIGGQ CAAACCCCTG TGftGACATGC 
SccCTGTCA ATAGCACTGA ATATTACACC CTGTGTCCCXS GAGGTGAAGG CTTCAGACCT 
SccCCATCA CAATCATTTT AGAAGACATT GACGAATGCC AGGAGTTACC AGGTCTCTGC 
^GGTC^ ACTGCATCAA CACTTTTGGG AGCTTCCAGT GTGAGTCCCC ACAAGGCTAC 
SSt^^ A^CCCG CATCTCTGAG GATATTGATG AGTCrTTTGC ACATCCTGGT 
J^^^: ^ACCTG CTATAACACC CIGGGAAATT ACACCTGCAT TTGCCCACCT 
^I^SS^ ^Satgg AGGCCACAAC TGCATGGACA TGAGAAAAAG ctttt^ac 
^ctIS ^AACCAC TTCTGAGAAT GfiGTTGCCTT TCAATGTGAC AAAAAGGATG 

?Stcctgca catataatot gggcaaagct gggaacaaac cttgtgaacc atcccoact 
cca^aSg cxgactttaa aaccatatgt ggaaatattc ctggattcac ctttgacatt 
mgctcttga catigatgaa tctaaagaga ttccaggcat ttctgcaaat 
^SgtoS ttaaccagat tggcaotttc cgctgtgaat gccc^cagg JTTC^™^ 
^TCACCTGC tgttggtttg tgaagatata gatcaotgca gcaatggtga taatctctcc 
^^^SSaTC ^A^CAT CAATAGTCCr ggtagttacc gctgtgaatc 'toocgooqct 
SSaacttt c^ccaatgg ggcctgtgta gatcgcaatg aatctttaga aattcctaac 

^JSSS SScTTGTG TCTTGATCre CAAGGAAGTT ACCAGTGCAT CTGCCACAAT 

StoSgga ccagaccatg tgcatggatg ttgatcagtg og^°°gcac 

cStSg^ ATGGAACTTG TAAAAACACC.GTTGGATCCT ATAACTGTCr GTOTACCCA 

^^^??TCAAC tSctcataa taatgattgc ctggacatag atgagtgcag TTCurum 
gcagaaatgg aogttgtttt aatcaaattg gttctttcaa ^^^f^ 
aaog^tt atgaacttac cccagatggc aaaaactgta tagacactaa tgagtgtgtc 
GCCmCCOG gctcttgctc tcctggtacc tctcagaatt tggagggatc cttcagatgc 
^<^iS. actaaaaagc gagaactgca ttgatataaa tgaat^^ 
^tcSa acatttgtct ttttggttcc tgtactaata ctccaggggg ctt^gtgc 
ctctgccccc ctggctttgt actatcigat aatggacgga gatgctttga tactcgccag 

^^StCTGCT TCACAAATTT TGAAAATGGA AAGTGTTCTG TAOOCAAAGC TtTCAA^CC 
^AAAGCAA AATGCTGCTG TAGTAAGATC CCAGGAGAGG GCTGGGGGGA COOCTOTGAG 
^S?^CCA AaSSaTGA AGTTGCATrr CAGGATTTGT GTOCATATGG CCATGGAACT 

?$S^TAC ACGTGAAGAT gtcaatgagt gtcttgagag cccagg^tt 
?S5^SJtg gtcaatgtat caacaccgac ggatcttttc gctgtgaatg tccaat^c 
tScaaccttg actacactgg agtacgctgt gtggatactg atgagtgttc aatosg^t 

SSStGGAA aSgTACATG CACCAATGTT ATTGGGAGTT TTGAATGCaA TTGCAATGAA 

SSgcccat gatgaattgt gaagatatca acgaatgtgc ccagaaccca 
SS?^G?G ctSacgctg catgaacact tttgggtcct atgaatgcac gtgccosatt 
ggctatgccc tcagggaaga tcaaaagatg tgcaaagatc tggatgaatg tgctgamgg 
??^S^CT gtgaatctag gggcatgatc tgtaagaatc taatcggcac cttcatgtgc 
Sctgccctc ctggaatggc ccgaaggocc gatggagaag gctgtgtaga tgaaaatgaa 

TCCAgScS AGCCAGGAAT CTGTGAAAAT GCACGTTGTG TTAACATTAT TC3GAAGCTAT 
GT^GAAGG ATTCCAGTCA AGTTCTTCAC GCACTGAATG CCTT^CAAT 
^CAGGGTC TCTGCTTTGC AGAGGTACTG CAGACAATAT GTCAAATCGC ATCCAGTAGT 
S^?^ tSctaagtc AGAATGCTGC TGTGATGGTG GGOGAGGCTG GGOCCACCAG 
TGCGAGCTTT GCCCACTTCC TGGAACTGCC CAGTACAAAA AGATATGTCC TCATGGCCCA 
GGATATACAA CTGATGGAAG AGATATTGAT GAATGTAAGG TAATGCCAAA CCTCTGCACC 
AATGGTCAGT GCATCAATAC CATGGGCTCA TTCCGATGCT TCTGCAAGGT TGGCTACACC 
i^^ScA GTGC5AACCTC TTGTATAGAQ CTTGATGAAT GCTCCCAGTC CCCGAAACCA 

tgcaactaca tctgcaagaa cactgagggg agttatcagt gttcatgtoc gaggggctat 

GTCCTGCAAG AGGATGGAAA GACATGCAAA GACCTTGATG AATGTCAAAC fAWSCAGCAT 

aactgccagt tcctctgtgt caacaccctg ggggggttta cctgtaaatg tccacctggt 

TTCACACAGC ATCACACTGC TTGTATOGAC AACAACGAAT GTGGGTCTCA ACCTTTGCTT 
TGXGGAGGAA AGGGAATCTG TCAAAACACT CCAGGCAGTT TCAGCTGTGA atgccaaaga 

gggttctctc ttgatgccac oggactgaac tgtgaagatg ttgatgaatg tgatgggaac 
cacaggtgcc aacacggctc ccagaacatc ctgggtggct acagatgtgg ctgcccccaa 
ggctacatcc agcactacca gtggaatcag tgtgtcgatg agaatgamg ctoamccc 
aatccctgtg GCTCTGCrrC ctgctacaac accctgggga gttacaagtg cgcctgcccc 
toggggttci ccttcgacca gttctccagt gcctgccaog aoctgaatga gtgctcgtcc 

TCCAAGAACC CCTGCAATTA CGGCTGCTCT AACACGGAGG GGGGCTACCT CTGTGGCTGC 
CCCCCTGGGT ATTACAGAGT GGGACAAGGC CACTGTGTCT CAGGAATGGG ATTTAACAAG 

GGGCAGTACC tgtcactoga tacagaggtc gatgaggaaa atgctctgtc cccagaagca 
tcctaogagt gcaaaatcaa cggctatcct aagaaagaca gca^cagaa gaga^tatt 
catgaacctg atcccactgc tgttgaacag atcagcctag agagtgtcga catggacagc 
cccgtcaaca tgaagttcaa cctctcccac ctcggctcta aggagcacat cctggaacta 

AGGCCOGCCA TCCAGCCCCT CAACAACCAC ATCCGTTATG TCATCTCTCA AGGGAACGAT 
GACAGCGTCT TCOGCATCCA CX»AAGGAAT GGGCTCAGCT acttgcacac ggccaagaag 
aagctcatgc ccxxscacata cacactggaa atcactagca tccctctcta caagaagaag 
gagcttaaga aactggaaga gagcaatgag gatgactacc tcctagggga gcttggggag 
gctctcagaa tgaggctgca gattcagctc tattaacxxst tcacagactt gggcccaggc 
tcaaatccta gcacagccag tctgcagaag catttgaaaa gtcaaggact aattttaaag 

AG6AAAAATA ATAATAACTC TTGTTTCTTT CCTCCCTGTC TTAGACTTTG AATGTTGACC 
ctcacaggga GGGATAATTT AGACTCTGGT ATGGCCAAAG ATTTGAGCTC AAAGGCAACC 
GTGGTTACTG TATTTTTTAT ATAACTTCAT TTTAAAATAT ATTAAAAGAA ACCTAAATG T 

tcaagatatc agcatatggc actaaatgca caaaaataat gtgagctttt 'i^-^ii^^ 

CCTGTTAGCA GTCTGTAACA CTTTGQGTAT TTTOCTATAG TTGCTAATTA AAAAAATATA 

gatgtttatt tatttttaat gcagtaatat atggagaaat gaacaaacta tgtaaacaaa 
aagggaaact cacttctttt tctttagatt tataaatttg agctattttt tttagaggtg 
ctttttaaaa atccaataga tacaagagat gtttcctttg gttttctgcc agtcatccag 
cigatacaca cctgatcgat tttaaagaaa gccacacaca gctgaatcgg gcagtgctaa 
tcaataattt aaaagacatg aatgtcatta gatcctttat aacgtagatc gaagccaaag 
cagctcattt gtgacaacat ttcatatcac cagacacacc J^gcaacaga agtttgaag^ 
caaccactgt agcaaaatac cttgactgct tgtgagacca ttago^c aggccaaacc 
gtactgtatt tccttctcat aacctcaagg aaccatatgt gctacccaca acacctcatt 
CTTA^;^ CT^CTGCK TCCTCATGGT ACTCTAGGCA CCSGMGMC OGCCGTTOOC 9660 
TTCAAAGGGA ACACCTGGCA TTCTGTGGTG TTTOSTGCTG TCTTAAATAA TGGTGCATTT 9720 
ATEATCrrCA AGTTATTTCA GGATTGCCAT ATG TGCAAAC AAATCATGCA ATGCAGCCAA 9780 
G(5AATATATG TTGTTGTTGT TGTTTTAAAC CCAlTriil'r TTTAGAATTT TCATTAATAC 9840 
TGTAGTTATA CACCATATGC CTCATTTTAT CATAGCCTAT TGTGTATGAA AGATGTTTGT 9900 
ACAAXtSAATT GATGTTTAGT TTGCTTTAGT CATTTAAAAA GATATTGTAC CAGGATGTGC 9960 
TATTAAGAGC ACGTATCCAT TATTCTTCTC AACCCAAGRA CCTCTTTCCT GGACCAGrnSl 10020 
CCAAACCTCA TA1<3TGAAAT GGCCAAAGCA CATGCAGGCT Cd WinXSTT C CTCTCAAAC 10080 
CTGTGCTCAC CAAAGATTAG TAACCAGTTA TAOCCACTAT TTTGAGGTTT TATTGTTTTT 10140 
TTAATAACTA AAAAAAAACT OGTGCC 



Seq ID NO: 459 Protein sequence 
Protein Accession #: NP_001990.1 

1 11 21 31 41 51 

| | | | | | 

^OUlRRIlCLQ LYPLWIiGCW LWAQGTAGQP QPPPPKPPRP QPPPQQVRSA TAGSEGGFLA 60 

PEVREEGAAV ASRVRRRGQQ DVLRGPNVOG SRFHSYCCPG WKTLPGCKQC IVPICRKSCG 120 

DGPCSRPKMC TCSSGQISST OGSKSIQQCS VRCMNGCTrCA DDHOQCQXGY IGTYOGQPVC 180 

ENGCQNGGRC lAQPCACVYG FTGPQCEBDY RTGPCPTQVM KQMCQGQLTG IVCrKTLCCA 240 

TTGRAWGHPC EKCPAQPQPC RRGFIPNIRT GACQDVDSC3Q AIPGICQGGN CINTVGSFEC 300 

RCPAOOCQSE TTQKCEDIDE CSIIPGICET GKCSNTVGSY FCVCPRGYVT STDGSRCIDQ 360 

RTGHCPSGLV NG8CAQEI.PG RMTKMQCCCE PGRCHGIGTI PEACPVRGSE EYRRLCKDGL 420 

FMGGIPGSAG SRPGGTGGNG PAPSOIGKGY GPGGTGPIPI PG<2IGFSPGV G6AGVGAGGQ 480 

GPIlTCLTIIi NQTIDICKHH ANIiCLNGRCI PTVSSYRCEC NMGYKQDANG DCIDVDECTS 540 

NPCTNGDCVN TPGSYYCKCH AGFQRTPTKQ ACIDIDECIQ NGVLCKNGRC VNSDGSFQCI 600 

C»AGFELTTD GKNCVDHDSC TTTNMCLNGM CINEDGSFKC ICKPGPVIAP NGRYCTWDE 660 

CQTPGICMNG HCINSEGSFR CDCPPGLAVG MDGRVCVDTH MRSTCYGGIK KGVCVRPFP6 720 

AVTKSECCCA NPDYGPGEPC QPCPAKNSAE FHGLCSSGVG ITVDGRDINE CALDPDICAM 780 

GICENIiRGSY RCNCNSGYEP DASGRNCIDI DECLVNRLLC DNGLCRNTPG SYSCTCPPGY 840 

VFRTETETCE DIKECESNPC VNGACRNNLG SFNCECSPGS KLSSTGIilCI DSLKGTCWIiN 900 

IQDSRCEVNI NGATLKSECC ATLGAAHGSP CERCELDTAC PRGLAHIKGV TCEDVNECEV 960 

FPGVCPKGRC VNSKGSFHCE CPEGLTU)GT 6RVCLDIRME QCYLKWDEDB CIHPVPGKFR 1020 

MDACCCAVGA AWGTECEECP KPGTKEYETl. CPRGAGPANR GDVLTGRPFY KDINECKAFP 1080 

GMCTYGKCRN TIGSFKCRCN SGFALDMBER NCTDIDECRI SPDLCGSGIC VNTPGSFECE 1140 

CPEGYESGFM MMKHOIDIDG CERNPLLCRG GTCVNTBGSF QCDCPLGHEL SPSREDCVDI 1200 

NECSLSDNLC HKGKCVNMIG TYQCSCNPGY OATPDRQGCT DIDECMIMNG GCDTQCraSE 1260 

GSYECSCSEG YAUTPDGRSC ADIDECENNP DICDGGQCTN IPGEYRCLCY DGFMASMDMK 1320 

TCIDVNECDIi NSNICMFGEC ENTKGSPICH CQLGYSVKKG TTGCTDVDEC EIGAHNCDMH 1380 

ASCLNIPGSP KCSCREGWIG NGIKCIDLDE CSNGTHQCSI NAQCVNTPGS YRCACSEGPT 1440 

GDGFTCSDVD ECAENINI*CE NGQCLNVPGA YRCECEMGFT PASOSRSCQD IDECSPC»IIC 1500 

VSGTCMNLPG MFHCICDDGY ELDRTGGMCT DIDECADPIN CVNGLCVNTP GRYEOJCPPD 1560 

FQLNPTGVGC VDNRVGMCYI* KFGPRGDGSL SCMTEIGVGV SRSSCCCSLG KAWGNPCETC 1620 

PPVNSTEYYT LCPGGEGFRP NPITIILEDI DEOQEl^LC QGGNCINTFG SFQCECPQGY 1680 

YLSEDTRICE DIDBCPAHPG VCGPGTCYIIT LGNYTCICPP EYMQVNGGHN CMDMRKSFCY 1740 

RSYNGTTCEN ELPFNVTKRM CCCTYNVGKA GMKPCEPCPT PGTADFKTIC GNIPGFTFDI 1800 

HTGECAVDIDE CECEIPGICAN GVCINQIGSF RCECPTGPSY NDLLLVCEDI DECSNGDNLC 1860 

QRNADCINSP GSYRCECAAG FKLSPNGACV DHNECLEIPN VCSHGLCVDL QGSYQCICHN 1920 

GFKASQDQTM C34DVDECERH PCGNGTCKNT VGSYNCLCYP GFELTHNNDC LDIDECSSFF 1980 

GQVCaaiGRCF NEIGSPKCLC NEGYELTPDG KNCIDTNBCV ALPGSCSPGT CQNLEGSFRC 2040 

ICPPGYEVKS ENCIDINECD EDPWICLFGS CTNTPGGFQC LCPPGFVLSD NGRRCFDTRQ 2100 

SPCFTNFENG KCSVPKAFNT TKAKCCCSKM PGBGWGDPCE LCPKDDBVAF QDLCPYGHGT 2160 

VPSLHDTRED VNECLESPGI CSNGQCINTD GSFRCECPMG YNLDYTGVRC VDTDECSIGM 2220 

POGNGTCTNV IGSPECNCNE GPEPGPMMNC EDINECAQNP LLCALRCMNT PGSYECTCPI 2280 

GYALREDQKM OCDLDECAEG LHDCESRGMM CKNLIGTFMC ICPPGMARRP DGEGCVDEME 2340 

CRTKPGICEN GRCVNIIGSY RCECNEGFQS SSSGTECLDN RQGLCFAEVL QTICQMASSS 2400 

RNLVTKSECC CDGGRGWGHQ CELCPLPGTA QYKKICPHGP GYTTDGRDID ECKVMPNLCT 2460 

NGQCIMTMGS FRCFCKVGYT TDISGTSCID LDECSQSPKP CSYICKNTEG SYQCSCPSGY 2520 

VLQEDGKTCK DLDECQTKQH NCQPLCVNTL GGFTCKCPPG FTQHHTACID NNECXSSQPLL 2580 

CGGKGICQNT PGSPSCECQR GFSIiDATGLK CEDVDECDGN HRCQHGCX3NI LG6YRCGCPQ 2640 

GYIQHYQWNQ CVDENECSNP NACX3SASCYN TLGSYKCACP SGFSFDQPSS ACHDVNECSS 2700 

SKNPCNYGCS NTEGGYLCGC PPGYYRVGQG HCVS<»1GFNK GQYLSLDTEV DEENAIiSPEA 2760 

CYECKINGYP KKDSRQKRSI HSPDPTAVEQ ISI»ESVDMDS PVNMKFNLSH LGSKEHH.EL 2820 

RPAIQPLNNH IRYVISQQTO DSVPRIHORM GLSYLHTAKK KLMPGTYTLE ITSIPLYKKK 2880 
ELKKLEESNE DDYLLGELGE ALRMRLQIQL Y 



Seq ID NO: 460 DNA sequence 

Nucleic Acid Accession #: NM_013372.1 

Coding sequence: 63..617 

1 11 21 31 41 51 

GCGGCCGCAC TCAGCGCCAC GCX3TCX3AAAG CGCAGGCCCC GAGGACCCGC CGCACTGACA 60 

GTATGAGCCG CACAGCCTAC ACGGTGGGAG CCCTGCTTCT CCTCTTGGGG ACCCTGCTGC 120 

(XGCTGCTGA AGGGAAAAAG AAAGGGTCCC AAGGTGCCAT CCCCCCGCCA GACAAGGCCC 180 

AGCACAATGA CTCAGAGCAG ACTCAGTOGC CCCAGCAGCC TGGCTCCAGG AACOGGGGGC 240 

GGGGCCAAGG GCGGGGCACT GCCATGCCCG GGGAGGAGGT GCTGGAGTCC AGCCAAGAGG 300 

CCCTGCATGT GACGGAGCGC AAATACCTGA AG0GAGACT6 GTGCAAAACC C^CGGCTTA 360 

AGCAGACCAT CCACGAGGAA GGCTGCAACA GTCGCACCAT CATCAACOGC TTCTGrTAOS 420 

GCCAGTGCAA CTCTTTCTAC ATCCCCAGGC ACATCOGGAA GGAGGAA6GT TCCTTTCAGT 480 

CCTGCTCCTT CTGCAAGCCC AAGAAATTCA CTACCATGAT GGTCACACTC AACTGCCCTG 540 

AACTACAGCC ACCTACCAAG AAGAAGAGAG TCACACGTGT GAAGCAGTGT CGTTGCATAT 600 

CCATCGATTT GGATTAAGCC AAATCCAGGT GCACCCACCA TGTCCTAGGA ATGCAGCCCC 660 

AGGAAGTCCC AGACCTAAAA CAACCAGATT CTTACTTGGC TTAAACCTAG AGGOCAGAAG 720 

AACCCCCAGC TGCCTCCTGG CAGGAGCCTG CTTGTGOGTA GTTOOlXiTCC ATGAGTGIGG 780 

ATGGGTGCCT GTGGGTGTTT TTAGACACCA GAGAAAACAC AGTCTCTGCT AGAGAGCACT 840 

CCCTATTTTG TAAACATATC TGCTTTAATG GGGATGTACC AGAAACOCAC CTCACCCOGG 900 
CTCACATCTA AAGQGGOGGG GCCGTGGTCT GGTrCTCACT TTGTGTTTTT GTGCOCTCCT 960 

GGGGACCAGA ATCTOCTTTC GGAATGAATG TTCATGGAAG AGG CTCCTC T GAGGGCAAGA 1020 

GACXrrGTTTT AGTGCTGCAT TOGACATGGA AAAGTCCTTT TAAOCTGTCC TTGCATCCTC 1080 

CTTTCCTCCT CXTTCCTCACA ATCCATCTCT TCTTAAGTTG ATAGTGACTA T GTCAGTC TA 1140 

ATCTCTTCTT TGCCAAGGTT CCTAAATTAA TTCACTTAAC CATGATGCAA ATGTTrTTCA 1200 

TTTK3TGAAG ACCCTCCAGA CTCTGGGAGA GGCTGGTGTG GGCAAGGACA AGCAGGATAG 1260 

TGGAGTGAGA AAQGGAGGGT GGAGGGTGAG GCCAAATCAG GTCCAGCAAA AGTCAGTAGG 1320 

GACATTGCAG AAGCTTGAAA GGCCAATACC AGAACACAGG CTGATGCTTC TGAGAAAGTC 1380 

TrrrCCTAGT ATTTAACAGA ACCCAAGTGA ACAGAGGAGA AATGAGATTG CX:AGAAAGTG 1440 

ATTAACmG GCCGTTGC3UV TCTGCTCAAA CCTAACACCA AACTGAAAAC ATAAATACTG 1500 

ACCACTCCTA TGTTOSGACC C3UW3CAAGTT AGCTAAACCA AACCAACTCC TCTGCTTTGT 1560 

CCCTCAGGTG GAAAAGAGAG GTAGTTTAGA ACTCTCTGCA TAGGGGTGGG AATTAATCAA 1620 

AAACCKCAGA GGCtGAAATT CCTAATACCT TTCCTTTATC GTGGTTATAG TCAGCTCATT 1680 

TCCATTCCAC TATTTCCCAT AATGCTTCTG AGAGCCACTA ACTTGATPGA TAAAGATCCT 1740 

GCCTCTCCTG AGTGTACCTG ACAGTAAGTC TAAAGATGAH A6AGTTTAGG GACTACTCTG 1800 

TTTTAGCAAG ARATATTKTG GGGGTCTTTT TGTTTTAACT ATTGTCftGGA GATTGGGCTA 1860 

HAGA6AAGAC GAOGAGAGTA AGGAAATAAA GGGRATTGCC TCPGGCTAG A GAGTAAGTTA 1920 

GGTCTTAATA CCTGGTAGAA ATGTAAGGGA TATGACCTCC CTTTCTTTAT GTGCTCACTG 1980 

AGGATCTGAG GGGAOCCTGT TAGGAGAGCA TAGCATCATG ATGTATTAGC TGTTCATCTG 2040 

CTACTGGTTC GATG6ACATA ACTATTGTAA CTATTCAGTA TTTACTGGTA GGCACTGTCC 2100 

TCTGATTAAA CTIGGCCTAC TGGCAATGGC TACTTAGGAT TGATCTAAGG GCCAAAGTGC 2160 

AGGGTGGGTG AACTTTATTC TACTTTGGAT TTGGTTAACC TGnTrCTTC AAGCCTGAGG 2220 

TTTTATATAC AAACTCCCTG AATACTCTTT TTGCCTTGTA TCTTCTCAGC CTCCTAGCCA 2280 

AGTCCTATGT AATATCGAAA ACAAACACTG CAGACTTGAG ATTCAGTTGC OGATCAAGG C 2340 

TCTGGCATTC AGAGAACCCT TGCAACTCGA GAAGCTGTTT TTATTTCGTT TTTGTTTTGA 2400 

TCCAGTGCTC TCCCATCTAA C3UVCTAAACA GGAGCCATTT CAAGGCCGGA GATATTTTAA 2460 

ACACCCAAAA TGTTGGGTCT GATTTTCAAA CTTTTAAACT CACTACTGAT GATTCTCACG 2520 

CTAGGCGAAT TTGTCCAAAC ACATAGTGTG TGTGTrTTGT ATACACTGTA TGACCCCACC 2580 

CCAAATCTTT GTATTOTCCA CATTCTCCAA CAATAAAGCA CAGAGTGGAT TTAATTAAGC 2640 

ACACAAATGC TAAGGCAGAA TTTTGAGGGT GGGAGAGAAG AAAAGGGAAA GAAGCTGAAA 2700 

ATGTAAAACC ACACCAGGGA GGAAAAATGA CATTCAGAAC C3W3C3^AACAC TGAATTTCTC 2760 

TTGTTGTTTT AACTCTGCCA CAAGAATGCA ATTTCGTTAA TGGAGATGAC TTAAGTTGGC 2820 

AGCAGTAATC TTCTTTTAGG AGCTTGTACC ACAGTCTTGC ACATAAGTGC AGATTTGGCT 2880 

CAAGTAAAGA GAATTTCCTC AACACTAACT TCACTGGGAT AATCAGCAGC GTAACTACCC 2940 

TAAAAGCATA TCACTAGCX» AAGAGGQAAA TATCTGTTCT TCrTACTGTG CCTATATTAA 3000 

GACTAGTACA AATGTGGTGT GTCTTCCAAC TTTCATTGAA AATGCCATAT CTATACCATA 3060 

TTTTATTCGA GTCACTGATG ATGTAATGAT ATATTTTTTC ATTATTATAG TAGAATATTT 3120 

TTATGGCAAG ATATTTGTGG TCTTCATCAT ACCTATTAAA ATAATGCCAA ACACCAAATA 3180 

TGAATTTTAT GATGTACACT TTGTGCTTGG CATTAAAAGA AAAAAACACA CATCCTGGAA 3240 

GTCTGTAAGT TGTTTTTTGT TACTGTAGGT CTTCAAAGTT AAGAGTGTAA GTGAAAAATC 3300 

TGGAGGAGAG GATAATTTCC ACTGTGTGGA ATGTGAATAG TTAAATGAAA AGTTATGGTT 3360 

ATTTAATGTA ATTATTACTT CAAATCCTTT GGTCACTGTG ATTTCAAGCA TGTTTTCTTT 3420 

TTCTCCTTTA TATGACTTTC TCTGAGTTGG GCAAAGAAGA AGCTGACACA CCGTATGTTG 3480 

TTAGAGTCTT TTATCTGGTC AGGGGAAACA AAATCTTGAC CCAGCTGAAC ATGTCTTCCT 3540 

GAGTCAGTGC CTGAATCTTT ATTTTTTAAA TTGAATGTTC CTTAAAGGTT AACATTTCTA 3600 

AAGCAATATT AAGAAAGACT TTAAATGTTA TTTTGGAAGA CTTAOGATGC ATGTATACAA 3660 

AOSAATAGCA GATAATGATG ACTAGTTCAC ACATAAAGTC CTTTTAAGGA GAAAATCTAA 3720 

AATGAAAAGT GGATAAACAG AACATTTATA AGTGATCAGT TAATGCCTAA GAGTGAAAGT 3780 

AGTTCTATTG ACATTCCTCA AGATATTTAA TATCAACTGC ATTATGTATT ATGTCTGCTT 3840 

AAATCATTTA AAAACGGCAA AGAATTATAT AGACTATGAG GTACCTTGCT GTGTAGGAGG 3900 

ATGAAAGGGG AGTrGATAGT CTCATAAAAC TAATTTGGCT TCAAGTTTCA TGAATCTGTA 3960 

ACTAGAATTT AATTTTCACC CCAATAATGT TCTATATAGC CTTTGCTAAA GAGCAACTAA 4020 
TAAATTAAAC CTATTCTTTC AAAAAAAAA 

Seq ID NO: 461 Protein sequence 
Protein Accession #: NP_037504.1 

1 11 21 31 41 51 

isRTAYTVOA LLLLLGTLLP AAB6KKKGSQ GAIPPPDKAQ HMDSEQTQSP QQPGSRNRGR 60 

GQQRGTAMPG EEVLBSSQEA LHVTERKYLK HDWCaCTQPLK QTIHEBGOIS RTIINRFCYG 120 

QCKSFYIP8H ZRXEE6SFQS CSPCKPKKFT TMMVTLNCPE LQPPTKKKRV TRVKQCRCIS 180 
IDLD 

Seq ID NO: 462 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1..2733 

1 11 21 31 41 51 

ATGAAAGTTG GAGTGCTGTG GCTCATTTCT TTCrTCACCT TCACTGACGG CCACXXTTCGC 60 

TTCCTGGGGA AAAATGATGG CATCAAAACA AAAAAAGAAC tCATTGTGAA TAAGAAAAAA 120 

CATCTAGGCC CAGTOGAAGA ATATCAGCTG CTGCTTCAGG TGACCTATAG AGATTCCAAG 180 

GAGAAAAGAG ATTTGAGAAA TTTTCTGAAG CTCTTGAAGC CTCCATTATT ATGGTCACAT 240 

GGGCTAATTA GAATTATCAG AGCAAAGGCT ACCACAGACT GCAACA6CCT GAATGGAGTC 300 

CTGCAGTGTA CCTGTGAAGA CAGCTACACC TGGTTTCCTC CCTCAIGCCT TGATCOCCAG 360 

AACTGCTACC TTCACACGGC TGGAGCACTC CCAAGCTGTG AATGTCATCT CAACAACCTC 420 

AGCCAGAGTG TCAATTTCTG TGAGAGAACA AACATTTGGG GCACTTTCAA AATTAATGAA 480 

AGCrrTACAA ATGACCTTTT GAATTCATCT TCTGCTATAT ACTCCAAATA TGCAAATGGA 540 

ATTGAAATTC AACTTAAAAA AGCATATGAA AGAATTCAAG GTTTTGAGTC GGTTCAGGTC 600 

ACCCAATTTC GAAATGGAAG CRTCGrEGCT GGGTA7GAAG TT6TTGGCTC CAGCAGTGCA 660 

TCTGAACTGC TGTCAGCCAT TGAACATGTT GCOGAGAAGG CTAAGACAGC CCTTCACAAG 720 

CTGTTTCCAT TAGAAGACGG CTCTTTCAGA GTGTTCGGAA AAGCCCAGTG TAATGACATT 780 

GTCrrrGGAT TTGGGTCCAA GG ATGATGAA TATACCCTGC CCTGCAGCAG TGGCTACAGG 840 

GGAAACATCA CAGCCAAGTG TGAGTCCTCT GGGTGGCAGG TCATCAGGGA GACTTGTGTG 900 

CTCTCTCTGC TTGAAGAACT GAACAAGAAT TTCAGTATGA TTGTAGGCAA TGCCACTGAG 960 

GCAGCTGTGT CATCCTTOGT GCAAAATCTT TCTGTCATCA TTOGCCAAAA CCCATCAACC 1020 
ACAGTGGGGA ATCTGGCTTC GGTGGTGTOG ATTCIGAGCa ATATTTCATC TCTGTCACTG 1080 

GOCAGCCATT TCftOGGTGTC CAATTCAACA ATOGAGGATG TCATCAGTAT AGCTGACAAT 1140 

ATOCTTAATT CAGOCTCAGT AACCAACTGG ACAGTCTTAC TGCGGGAACA AAAGTATGCC 1200 

AGCTCACGGT TACTAGAGAC ATTAGAAAAC ATCAGCACTC TGGTGCCTCC GACAGCTCTT 1260 

OCTCTGAATT TTTCTCGGAA ATTCATTGAC TGGAAAGGGA TTCCAGTGAA CAAAAGCCAA 1320 

CTCAAAAGGG GTTACAOCTA 7CAGATTAAA ATOTGTOCCC AAAATACATC TATTOCCATC 1380 

AGAGGCOGTG TGTTAATTGG GTGAGACCAA TTOCAGAGAT GCCTTCCAGA AACTATTATC 1440 

AGCATGGCCT CGTTGACTCT GGGGAACATT CTAOCOGTTT CCAAAAATGG AAATGCTCAG 1500 

GTCAATGGAC CTGTGATATC CACGGTTATT CAAAACTATT CCATAAATGA AGTTCTCCTA 1560 

TTTTTTTCCA AGATAGAGTC AAACCTGAGC CAGCCTCATT GTGTGTTTTG GGATTTCAGT 1620 

CATTTGCAGT GGAACGATGC AGGCTGCCAC CTAGTGAATG AAACTCAAGA CATOCTGACG 1680 

TGCCAATGTA CTCACTTGAC CTCCTTCTCC ATATTGATGT CACCTTTTGT CCCCTCTACA 1740 

ATCTTCCCCG TTCTAAAATG GATCACCTAT GTGGGACTGG CTATCTCCAT TGGAAGTCTC 1800 

ATTTTATGCC TGATCATCGA GGCTTTGTTT TGGAAGCAGA TTAAAAAAAG CCAAACCTCT 1860 

CACACACGTC GTATTTGCAT GGTGAACATA GCCCTGTCCC TCTTGATTGC TGATGICTGG 1920 

TTTATTGTTG GTGCCACAGT GGACACCAOG GTGAACCCTT CTGGAGTCTG CACAGCTGCT 1980 

GTGTTCTTTA CACACTTCTT CTACCTCTCT TTGTTCTTCT GGATGCTCAT GCTTGGCATC 2040 

CTGCTGGCTT ACOGGATCAT CXrTCGTGTTC CATCACATGG CCCAGCATTT GATGATGGCT 2100 

GTTQGATTTT GCCTGGGTTA TGGGTGOCCT CTCATTATAT CTGTCATTAC CATTGCTGTC 2160 

AOGCAACCTA GCAATACCTA CAAAAG6AAA GATGtGTGTT GGCTTAACTG GTCCAATGGA 2220 

AGCAAACCAC TCCTGGCTTT TGTTGTCCCT GCACTGGCTA TTGTGGCTGT GAACTTCXJTT 2280 

GTGGTGCTGC TAGTTCTCAC AAAGCTCTGG AGGCCGACTG TTGGGGAAAG ACTGAGT0G6 2340 

GATGACAAGG CX3WXATCAT COGOGTGGGG AAGAGCCTCC TCATTCTGAC CCCTCTGCTA 2400 

GGGCTCACCT GGGGCTTTGG AATAGGAACA ATAGTGGACA GCCAGAATCT GGCTTGGCAT 2460 

GTTATTTTTG CTTTACTCAA TGCATTCCAG GGATTTTTTA TCTTATGCTT TGGAATACTC 2520 

TTGGACAGTA AGCTGCGACA ACTTCTGTTC AACAftGTTGT CTGCCTTAAG TTCTTGGAAG 2580 

GAAACAGAAA AGCAAAACTC ATCAGATTTA TGTGCCAAAC GCAAATTCTC AAAGCCTTTC 2640 

AACCCACTGC AAAACAAAGG CCATTATGCA TTTTCTCATA CTGGAGATTC CTCXX3ACAAC 2700 
ATCATGCTAA CTCAGTTTGT CTCAAATGAA TAA 

Seq ID NO: 463 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MKVGVLWLIS FFTPTDGHQG PliGKKDGIKT KKELIVNKKK HLGPVBEYQL LLQVTYRDSK 60 

EKRDLRNFLK LLKPPLLWSH GLIRIIRAKA TTDCNSLNGV LQCTCEDSYT WFPPSCLDPQ 120 

MCYLHTAGAL PSCECHLNNL SQSVNFCERT KIWGTFKINE RPTNDLLNSS SAIYSKYAKG 180 

lEIQLKKAYE RIQGFESVQV TQPRNGSIVA GYEWGSSSA SCLLSAIEKV ABKAKTALHK 240 

LFPLEDGSFR VFGXAQCNDI VF6F6SXDDE YTLPCSSGYR QIZTAKCESS GWQVIRETCV 300 

LSLLEELNKN FSMIVGNATE AAVSSFVQlJL SVIIRQNPST TVGNIASWS ILSNISSLSL 360 

ASHFRVSNST MEDVISIADN ILNSASVTNW TVLLREEKYA SSRLIiETIiEN ISTLVPPTAL 420 

PLNFSRKFID WKGIPVNKSQ liKRGYSYQIK MCPQNTSIPI RGRVLIGSDQ FQRSLPETII 480 

SMASIjTLGMI LPVSKNGNAQ VNGPVISTVI QNYSINEVFL FPSKIESNLS QPHCVFWDFS 540 

HLQWNHAGCH LVNETQDIVT CQCTHLTSFS ILMSPFVPST IFPWKWITY VGLGISIGSL 600 

ILCLIIEALP WKQIKKSQTS HTRRICHVMI ALSLLIADVW PIVGATVDTT VNPSGVCTAA 660 

VFFTHFFYLS LFFWMLMLGI LLAYRIILVF HHMAQHLMMA VGFdiGYGCP LIISVITIAV 720 

TQPSNTYKRK DVCWLKWSNG SKPLLAFWP ALAIVAVNFV WLLVLTKLW RPTVGERLSR 780 

DDKATIIRVG KSLLILTPLI* GLTWGFGIGT IVDSQNIjAWH VIFAUUiAFQ GFFILCPGIL 840 

LDSKLRQLLF NKLSALSSWK QTEKQNSSOL SAKPXFSKPP KPLQUXGHYA PSHTGDSSDH 900 
IMLTQPVSME 

Seq ID NO: 464 DNA sequence 

Nucleic Acid Accession #: AB035089.1 

Coding sequence: 9845..10219 

1 11 21 31 41 51 

| | | | | | 

GGGCATGCAG CCATCGGGGA AAATCCATAG TGCAGATAAA GCAAGGAGGA AGAAGAAGGA 60 

CAGTTCTAGT AAAAGGGAGA ACATCAATAT AGGATGTTTC TTAGCAATAG AAAAAGAAGG 120 

CCAAGAGGAA TTAGGGAGAG AGTTATAAGA GATCAGCAAG GGGACAGGGT TAGATTTGGT 180 

TTGGTTTGAA AGCATACAGT AAATATGATG TCTGTCXrCTG GCAGTGTTGG CAGAGTAGGA 240 

AGGAGGAAGG GAGGCAAGAG ATAATATCAT TTTCTCTGTG CTCCAACTGT ACTTACATAT 300 

GAGACTATTT CCCTCTCTGC TTTTCAAACC TTACTGGAGT TGTTTTCCCT CATGAAAACC 360 

AAGAAAG6AA AGCTAGTTAG TCTTGTTCTG AGGTTGTTCA ATGTATACAT ATCTATATCT 420 

GTAGACAGAA TCCTTGGGAA TACAGTAATT GACATATATT CTGTTATrrO ATGCTTGAAA 480 

AATCTCCTCC ACTAACCAGT TTCCCTATAG ATTGCCACAA GCACATAATA AGAAACAATA 540 

AATAAAATGT TCTCTTGACT TTGTTACTTA ACAATGCTGA GAAAACTTTA CAGCCTTCAT 600 

AAGGAAGTGA GGTCCAGGAA AATCTAGGAG ATATTTCTTA ACCAATCTAT AAAGGCATTA 660 

GTAATGACAG GATATTTCCT GAAAGTGTAA TTTCCCATTG AGGATTTGTT TTTAArTTCT 720 

GGATTCCTG6 AGCCAATGAA GTTGGT6TAT GTTTATGAAA TATCAAGAGA CATAAGTTGG 780 

CAAGTGTTCA TATGCAAAAA CTTCTTGGAA TTTCTGAGTT CTCTGTGGCA ATATATGACA 840 

TCAGGATATG TCCAGTCTCA CACACCAGGA TATGTCCTTT CTAGCCTGTC TATCACATGC 900 

TAGGAGAACT ATTTAGGAAC AGAAAAAAAT GCCTGAAATG ATTTCTCATT TGAACTCATC 960 

CAAGCTTTCT CTAAATTTAA GCAAACTCCT GGTCATTTTC AGTTAGTACC TTTCCTTAAG 1020 

TTCAACCTTC AGGGCAAACC TCCGTGCCTC AGACGTTTAG CCATAGTCTG AAATPCTCTT 1080 

CCATAGATTG GTCCCCTGTA ACCCOGGTTT GTCTCAGCTT GTTATCCTGT TTTTTTCTTC 1140 

CCTCCATTCC CAGGATGAGC TTGTTGCTTC TGTCCTATGA GACATTAGAT TCCTTTTCTT 1200 

TGGTACCGGA GTAAATCCAT CCTACTGCAA TAGAGGAAG6 TCCATTTTTG TCTTATAGOG 1260 

CTGGATGCAG ACTCAGCTGA GAAGAGCATT ATTCATTTTT GGAATTCTTT ATCTCAGATA 1320 

TTTCCTCTTC TTTCTTTTTC TTCTATCTTT GGATTTTTAG TCCATCAACG CCCCATTAGT 1380 

CTATTCCCCG ACTTCAATCA GGGAACTTAT ACCTCTTAAA CTCATTCAGA GACTCAAAAC 1440 

ATATATATTG ATACAGGAGA CCTAAGAAGA GCATGTCTTG GGGGTTGAGG AAACAGGCAG 1500 

GTQAGAAATT TCCAGATTGG AAACACAGCT TCCTTTCTCC CATCCAGCCC CTACTTTCAG 1560 

CCTATGTGTT TCTGGCACCT TGTTGTAGAT AAATCTCCCT TGACTTTGTG ATGTGCTGAG 1620 

AAAACAAACT CAOGGCIGGT GTTAAAAAGG GCCCAT6ACA ATACCAAGTG TTGGGGAGAA 1680 

TGTGGAGAAA TCAGAACTCT ATTCAOGGTC GGTTGGAATG CACACTTGTG CAGAATTCTA 1740 
TGGAGAAGAG TCTQGCAm OCTCAAAATG TTAACCTGGA TTTACCATAT GACCCAGOCSA 1800 

TTTCATTCAT AGGTTTATAC TCAAAACaAA TGAAGAAATA TGCCATGCAA AAAAATGTAC 1860 

ATGAAAGGTC ACAACATCAT TATTCATAAT AGTAAAAGGA TGGAAACAAC ACAAATGTCC 1920 

ATCAACTTAT GATTAAAGAA AATCTGGTCT ATTCAIAGAA TGGAATATSA TTOGACCACA 1980 

AAAAGGAATG ATGTACTGAT CCATGCAATG ATCTOSACAA ACCAJGAAAA TAACACTAGA 2040 

TTAAAGAAGC CAGTCACAAA AGGACTTACT GTATGATTCC ATTTACCTGA AATGTTTGGA 2100 

ATAGGCAAAT CCATAGAAAC AGGAGGTAGA TTCCTGGTTT CCAGGGTCTC CAGGAACGGA 2160 

AGAATCAAGT ACAAGATTTC TTTTGGAGGT AGTGAAATTG TTGTGGAATG AGATCATGAT 2220 

GATGATAGCA CAACTTTGTG AATATAATAA AATCATTGAA TTGTACAGTT GAATTTATGG 2280 

TATATAAATT ATATGTTAAT AAAAAGGGGG TCCACAAAAC AAACAGCCCC CCACICTGGT 2340 

TGTCAGGGAG AIATTGGATT AAATGGOCTT G6ACAACAAC COCTCTOCCT GGOCACAGAC 2400 

ATTCTTCAGA TTACAAGATA TTCCAGQGGA AACACTGGAA TGAGTCTGAA GC XaGGT GCT 2460 

AAACAGAAGG ACCATTGAGA AATGTTGTGA TCCTGACAGG TCAAGCAATT TATTTTTC3GG 2520 

CTTCATTTTT AAATGTAAAA TTAGAAAGCT GCCATTTAAA ATGGCCCGTC TGTTTCAATT 2580 

GCTCTTCTCA GTGTCAGOCT GTTAACTCAA TGTGTTAGTC TGTTTTCATG CTGCTGATAA 2640 

AAACATACCT GAGACTGGCA AGAAAAAGAG GTTTAATTGG GCrTAGAGTT CCACGTGATT 2700 

GOGGAGGCCT CAGAATCACA GTAGGAG6CA AAAGTTATTC TTACATGGTG GCTGCAAGAG 2760 

AAGATGAGGA AGAAGCAAAA GAAGAAACCC CTGATAAACC CATCGGATCT CCTGAGGCTT 2820 

ATTAACTATC ATGAGAATAG CACAAGAAAG ACOaGCCCCC ATGATTCAAT TAOCTCTACC 2880 

TGGGTCCCTC CAATAACATG TGGAAATTCT GGTAGATACA ATTCAAGTTG AGATTTGGGT 2940 

GGGAACACAG CCAAACCATA TCACTCAGCA AGGCAGATAA CTTTCTCACT GAGCCTATGC 3000 

AACAGAAAAC CATCTGGGAT GGTTGTAAGG GGCACAGGAA GTGACTGGTA GGATCACTGC 3060 

CAAAGCTGAG CACTCAGGAG AAGGCAATAG AATCCTATTC TCCATAGTAT GC TATAAGAT 3120 

ACTGAAGTAC ACTTCTTCAC TATCTCTTTG GACTTAGAAT TAGCACTACA TTCCTTGTTA 3180 

TACAGAAAAA TTACTAAGGA AATTCATAGG ATGACAAAAA CTTTCAGAAC TGAAAAACAG 3240 

GAAATGTAAQ CTTTTTAGTT CTTTGGTATT OGAAGTATGC CTAAAAGACA ATGCAAAATC 3300 

CAAGAAAAGA ATGGTGGGGT TTTTGrrTGT TTGGTTTTGT TTrTGTTTTA CAGCTGGAGT 3360 

AGAATACAAA GGGATGGAGT TOAAACAAAT " GAGAGGAAAT TGGAAtTCTA AXCfT ATTCT 3420 

CATTGGCATT AGAAAGGCAC CTACATGTAT TTCACATGAG CC3GGTGACTG CTGACTTGCA 3480 

TTCTTATTTT TTCCCTATAG ATTAAAAAGG AGGTACAATG GTAGAACTGT AATCCTGTCC 3540 

TTTGTCATAA ATTTTCATAT TCATAAAGGT GAGTGTTAGC CCGCrTGTGA AATCTGAACT 3600 

TGAGTAACTT CAAATACTAA CCACAGAGGG AAAGGCAGCA AGAGGAGAGG CATAAATTTA 3660 

GGATCTCACC CTTCATTCCA CAGACACACA CAGCCTCTCT GCCCACCTCT GCTTCCTCTA 3720 

GGAACACAGG TAAGAGCTTC AAGCCTCTCC AGCTTAATAA CATGAATTAT TTT TGAG AAT 3780 

AATAATGATA CTGTGTTCTA TATCATGCAT CTCCTGCATT CTGTCTGATT ATATTTTACT 3840 

TATTCTGCCA GAGCAAAATT AAAATACCTA TTTCATCTGA rrTGICCTTT ATCTAAATTG 3900 

CTTAGTTCCA AGTAAACCAA GGCACTTTTA GGAACACAGA GGGAGAGTGC CTTGCAG(3CA 3960 

GAGAGTCTTG AAGGAGATGT CAGGGACGCA TCTTAACAGC TGGTTGGATG TGATCCACAG 4020 

AGGTCTCCTG TTAGCATTCA TTGTAAAGCC ATCCTACCTA GCTCTAGTGT AACCAGCAAT 4080 

GAAAGAAAGA TAAAGAGGGT CGATTACTTA TTTACAATAG TCTTTAAAAA CGTAGTTTTG 4140 

TAAGCCTTCT AATTAGGACA TTAATATATT TAATATATGC ACATTGTAGA AAGATTGAAG 4200 

CGTTAAAAAT AAGAGAAAAA CTTTAAATGT CAAAATCTCA CAACCCAGAT ATATCATTTC 4260 

TTTAAGAAAA ITGTACTACA AAATACCATT CCATTTATTA AAGTCATTCT 6ACAGGAATC 4320 

TGATGCTTTT CCAGGAGTTC CAGATCACAT CGAGTTCACC ATGAATTCAC TCAGTGAAGC 4380 

CAACACCAAG TTCATGTTCG ATCTGTTCCA ACAGTTCAGA AAATCAAAAG AGAACAACAT 4440 

CTTCTATTCC CCTATCAGCA TCACATCAGC ATTAGGGATG GTCCTCTTAG GAGCCAAAGA 4500 

CAACACTGCA CAACAAATTA GCAAGGTAGC TATCAGCATC ATTACGTTGT CCTGTTGCAG 4560 

TTTTTCrCTG GTTCCGTCGG CTAGCACGCA GATGGTAATA GATGTGGTGG TCTGATGGGT 4620 

AGCACAGGGG GCTGTGCAGG AATTCCCATA ACTGTGAGAC CACTGACTTA AACAGATCTT 4680 

TTGAGTAAAG TTTTCTTGTC CCX^CTTCATG TCTCTTCCAG GTTCTTCACP TTGATCAAGT 4740 

CACAGAGAAC ACCACAGAAA AAGCTGCAAC ATATCATGTG AGTCACAGAG CACTCTGATT 4800 

CAGCTTTAGA TCCCTGAACA GGTCATAGTT TAAACCTGGA ACTTCACAAA AACTAAGAAA 4860 

AGGCCAGTTT TAGGGAAAAT CTTGGACACA AAGATTGAGA CATACAGAGT GGGTTGGCAT 4920 

TTCATGGCAC ATAATTATTA TTCCTCATTT CTGOGTTACT AAAAGACAGT CAGCACTGTA 4980 

CCTCAGAGCA TAGGTCTGGA TCAGGATAGG CTGGOTTCAG ACTCCAGCTT TGCTCTTCAC 5040 

AAATGATGAA TAAGAGCAQG ACACAACTQC TCX3GAGTCCC AGTGACCTCA TCCCAGAAAA 5100 

CTAAGGGTAA GAAAAAATCT GACTC3UITAC ATGCAAATAC ATGCAAATGT TTACAACAGT 5160 

GCCTTGCCCA TAAAAGTCAT AATAAATGTT ATTATTATTA TAAAGTAGCT ATAATTATAC 5220 

TAATCATAAT AATGTGAAAA TAATTTAATT TTCATTGAGT CATTAATGAG ATTCAGAGGA 5280 

ATAAGCACAA GTCCAAGTAT ATTTTGGAAA ATGATTGCTA TGGAATATAT TGGTTTAGAG 5340 

CCTTAATAGT GCAAAATGCT TTGCTGGAAG GTAGAAAGTT CTAGATTTAA ACAGGCTTAG 5400 

GTTCAAAACT TGGCACTTCT AATTTATGTC TCTATAAACA GGGTTTTTTT CCCCATTCTC 5460 

TGAGCTTTCT TGTGTTCATC TGAATTGAAC TAAAGACTTA GAGTTACCCA TGTAAAGTCC 5520 

TTAGCCATGG ACCTGGCATA CACTCTTCTT ACGTGCAGAG AAT6ACCATC ATGAGGAAAG 5580 

AGCCACAGAT CAGTCAATGT GTCCTACAAG ATAATAGCAC CAACAGGTAT AACAGGGCTT 5640 

CCTGGCATAA TCTATTTAAA ATATCCAACC TTCAACATAC TCGTATCCTT GATGACTGTT 5700 

AGAAGTGAAA TATGGTCCTT GCCCATAAGG AGCTGAGAGT TTAACTGGGA AGCTAAACCT 5760 

AACOCTTTAA ACCAACAAGG AGAAAATCOA CTGGTAGACA GCGCTGCATC TTTACTTCAG 5820 

AAGAGAAAAG ATTGCAGTAC GTTAGAGCAA GAAGAATTTT CTGGAAGAAG TCAAATATAA 5880 

GGTGGATTTT GAAGQGTATT TGAGGTGAAA TACACCAATT ATCAGGGAAT AACATCAAAG 5940 

GTCCTCAATG AGACTACCAG CATTTAGGGA CTGATCTAAC AGACTTAGCA TGGGTTTAGT 6000 

ATTTACATTG ATACAGCAAT TGAATGATCT CCTTTTTTGA TGTTTGAAGG TTGATAGGTC 6060 

AGGAAATGTT CATCACCAGT TTCAAAAGCT TCTGACTGAA TTCAACAAAT CCACTGATGC 6120 

ATATGAGCTG AAGATCGCCA ACAAGCTCTT CGGAGAAAAG A(3GTATCAAT TTTTACAGGT 6180 

AATTTCACCT GGCCTACCCA CATTTCATTT GCATCCTGAT GTCTGTGTCT CTGAGTGGCC 6240 

AAATGGAAGA AAGCAAGGCA 6ATGAGCCTG GGCGACCCAG GTGGAGAGCA TTTACTCAGA 6300 

GTGCATTAGC TCCATTTCCA CAACTCTCCC CCACTGGAGT GTCCCAGAC2C CCAACGATAC 6360 

ATCACTGAAG TGrGGATTTA GGGATAATCT TGTGATAAAA GAGGAGGTTG TGTAATAGAG 6420 

TGAGTAAGAG TAATAAGTAA TAAGATACCA TCGATAAACT GGCACTGACT CAGTCACATA 6480 

CGATACATCT TGGTGGGAAA TGTATGACTA ATGGGATATT ATTGGAATGG GCAGGCTTGG 6540 

GTGAGTTCCT GAGAATAGTT GAGGAAGTAC CAGGAAATAT TGAATGCACA GGATGAAAGA 6600 

CAAAAACAAA 6ATCAGAAAC ATCATGGTTA AAATTACTGG AGAGAAGTCT GAGAAGCAAT 6660 

GAATCTCCTT CAGGGAAGCC TGCTCTGCAG TTTGCAAACC ACAGCCTCTT CTGCTTCTGC 6720 

CTTTTGCCAA GATGATATTG ACCTTCAGTG ACCTCTTTCT TGTGCCAGOC CACATTCCCC 6780 

TTTTGCATTG CCTACATGAC ACCTGTATAA AAATATOCAT GGACAGGAGA TACTGCATCT 6840 

ATTCAGGGTC T6GATTCAGC TTACTGTTGT TACAAATAAG TAAGTTTGGT AATATATAGT 6900 

TACATAAATT ACTCCTAATT CCTACTTCTT CCTTCATATC TCAAAGGAAT ATTTACATGC 6960 
CATCAAGAAA TTTTAOCAGA OCAGTGTGGA ATCTACTGAT TTTGCAAATC CTCCAGAAGA 7020 
AAGTOSAAflG AJySATtAACT OCTGGGTGGA AAGTCAAACG AATGGTAGGA GAGOCROCCA 7080 
TTATAGAAAC ACCTTTGAGA AACCTATGCC AGTGAGCCTT GTGCTTGACA CTGCATGGGG 7140 
GAACAGGTOT GGGGATTGAC ATGGGTTTGC AGGGAGGGCT GAAG3VGGGCA CTCCAGATGA 7200 
AGGATTTGTC CAAATGAATA TGAAGAGAGC CTAGGGC»GC CAAGGAGGAA ATCACAGGAA 7260 
GCCAATTAGA TGGAAACACA TCTGGAGAAT TATTTGCTTA TOGOCCTGCA T GACR ATAGC 7320 
TTTGTCGATC CCCTGTCTCC GCTCAGACCT ATTTTGACAT CATATCCTTT ACTTTAAATC 7380 
AGACTCAAAT TTTTATGATG AATATTTAAT AGAAAAC3VTT AGAAAfiOGTC TCTOGTCTCC 7440 
TTTACTAATT GGGAAACAAG CAGCTCTCTG GTAAATCACC Cr iTl G TCTC TGAGCTGGAG 7500 
CTCCCTGCAT CACATCTGTA GCCAATGTGT TCTGCACGGA TTATCACAGC TCPCTTCCCC 7560 
ATCAAGGGCA AAGAGCTTGA CAAAGTCTCC ATTCTACAGA CATCTITCTT ACCTCCCACC 7620 
TCTCATTACA GGCCAAACTT ACAGCAACTC AACATGAGAG TGAATAGGAA GATACCCCCG 7680 
GAAGTAGTGT CIGACAGCAC AGGACATGCG TTTCATATTA CAGAGCTCAA GTCACTCATC 7740 
CTAAAATGCA ATCAGGGOCT CtTi'OCT C rG AATGOGGAOC CC]GTAGTTAA AAAAAAATAA 7800 
AAGTAGGAAG AGGAGGGAGG GAGAAAGGAA AGACACATGT TGGAAGAGTA GACAAAATCA 7860 
GTTTATCAGT ATTCCAAATC AGATGATTGG AGACATTCAT ACAC31GAGAA CXTTGAACTCC 7920 
TTCTCTATCA CAAGAACTGA TGTCTCXZATC AAGGGTAACT TTATACGACT GGAGCCTTGA 7980 
AGAAAGCTGC ATCTGGTGAA CCACTGGTCA GTGAGTCTAA CAATTCAAAG ATCAA AGTCA 8040 
GTGAGrCTCA AGCAGGGATT TGGGTCAATA ATTAAOGATC AGTCAO GAAC ATTTGCAAAG 8100 
CATCTTCCAG ACAAGOCATT TGTAGCTTGT GTAAAAGACT CrTTTATTCr TTCCCTTGCA 8160 
GAAAAAATTA AAAACCTATT TCCTGATGGG ACTATTGGCA ATGATACGAC ACTGGTTCTT 8220 
GTGAACGCAA TCTATTTCAA AGGGCAGTGG GAGAATAAAT TTAAAAAAGA AAACAC1AAA 8280 
GAGGAAAAAT TTTGGCCAAA CAAGGTATTG TCTATATTTT ATTTATATAG TGTAATATGT 8340 
TAATACATGG AATGTTAAAC ATTTCTGATG GAATGTAACA TGATAAGTAA AAAATAAAAA 8400 
TTGrrCATCT CTGTrATTTT GTTGTTTTAC TCTTATAACT TTATTTAGTT AGGAATACCT 8460 
GAAAAACTAT TGTTTCTAAC TCATGGAATT CCTOGGTTAT TTCTTAGAAG AAGAAGGATG 8520 
TCTTCCTATC TOUTAATAT TATCTTTTTT GTCTTGTGTT TCAOGTCTTA TTTGTTGGAC 8580 
ACATTGATTT ATTGCAGAAT ACATACAAAT CTGTACAGAT GATGAGGCAA TACAATTCCT 8640 
TTAATTTTGC CTTGCPGGAG GATGTACAGG CCAAGGTCCT GGAAATACCA TACAAAGGCA 8700 
AAGATCTAAG CATGATTGTG CTGCTGCCAA ATGAAATGGA TGGTCTGCAG AAG GTAA GAA 8760 
CTTGCATCTA CAACTCTTCC TTCTACTGCX: GGACATTTTT CCAAAGATAC CAAGTTTAAA 8820 
CAAGCyTAAAA GCPTATGACC QAGTTGCCTC AAAATGATGA AAAATTCTAA ATGAGGAATG 8880 
ATGACTCACC TTCATATTAC AAATATTTGA GCATAGGGCC TGACACAAAC TGAAAGCTTA 8940 
GTTTTTGTTT GTTTGTTTGT TTTTATTATT ATTATTATAA TACTTTAAGC TTTAGGGTAC 9000 
ATGTGCACAA TGTGCAGGTT AGTTACATAT GTATACATGT GCCATGCTGG TGTGCTGCAC 9060 
CCATTAACTC ATCATTTAGC GTTAGGTATA TCTCCTAATG CTATCCCTCC CCCCTCCCCC 9120 
CACCCCACAA CAGTCCTCAG AGTGTGATGT TACCTTCCTG TGTCCAAGTG TTCTCATTGT 9180 
TCAATTCCCA TCTATGATTT AATTCCATCT ATGGCTTAGT TAATGATTAA TTTATTAGAG 9240 
TTACATGCAT TGGATATCAA TTTGATGATA TTATTATGCA GCAATTTAAA CTTGACTGGG 9300 
AGAAATATAT ACCAATGTGA GGAAAGTTTA CAAATA66CC GAGTAGAAAA GGGAATACAA 9360 
ATTTAGGAAT TTAGGGAATT ACAATTTAAT AATTGCAATG TGTACTAAAT AATGTATACA 9420 
GAAAAATATG ATGAGCCTAT TAAAAATTGA CACATGTAGT AGGCTGTTGG CACAAGAAAT 9480 
A6TGATACAT ACAGTTCATT GTGTACAAAA TAATGTAATC ATATTTTACA TGTGTATCAT 9540 
ACAGTTGTAT ACATACATAT GTACACATAT ACATATACGT AAAAACATGA TTCTGTTTTT 9600 
ACATACATGT ATATACATAT ACACATATAA CCCAATGTAT TTATATATTC AGGACTCATA 9660 
TTTTACCTAT TAGAATAATA ATGTCTATTA AAGTGAACCT TCTGTATTTC ACATTTATTG 9720 
CCAAAATAAC GAATCTCCAC ATAGTCAATT CATTGTTAAG GTGTATTAGA GATCGACAGT 9780 
TAGTCATATC AGTTTCTTTT TTCCATTTGT ATAGCTTGAA GAGAAACTCA CTCCTGAGAA 9840 
ATTGATGGAA TGGACAAGTT TGCAGAATAT GAGAGAGACA TGTGTCGATT TACACTTACC 9900 
TCGGTTCAAA ATGGAAGAGA GCTATGACCT CAAGGACACG TTGAGAACCA TGGGAATGGT 9960 
GAATATCTTC AATGGGGATG CAGACXTTCTC AGGCATGACC TGGAGCCACX; GTCTCTCAGT 10020 
ATCTAAAGTC CTACACAAGG CCTTTGTGGA GGTCACIGAG GAGGGAGTGG AAGCTGCAGC 10080 
T6OCAC0GCT GTAGTAGTAG TOGAATTATC ATCTCCTTCA ACTAATGAAG AGTTCTGTTG 10140 
TAATCACOCT TTCCTATTCT TCATAAGGCA AAATAAGACC AACAGCATCC TCTTCTATGG 10200 
CAGATTCTCA TCCCCATA6A TGCAATTAGT CTGTCACTCC ATTTA6AAAA TGTTCACCTA 10260 
GAGGTCTTCT GGTAAACTGA TTGCTGGCAA CAACAGATTC TCTTGGCTCA TATTTCTTTT 10320 
CTATCTCATC TTGATGATGA TAGTCATCAT CAAGAATTTA ATGATTAAAA TAGCATGCCT 10380 
TCTCTCTCTTT CTCTTAATAA GCGCACATAT AARTGTACTT TTCCTTCCAG AAAAATTTCC 10440 
CTTOAGGAAA AATGTCCAAG ATAAGATGAA TCATTTAATA CCGTCTCTTC TAAATTTGAA 10500 
ATATAATTCT GTTTCTGACC TGTTTTAAAT GAACCAAACC AAATCATACT TTCTCTTCAA 10560 
ATTTAGCAAC CTAGAAACAC ACATTTCTTT GAATTTAGGT GATACCTAAA TCCTTCTTAT 10620 
GTTTCTAAAT TTTGTGATTC TATAAAACAC ATCATCAATA AAATAATGAC ATAAAATCAT 10680 
TTTTGCTTTA CCTGTTTTCT CTCTGGAAAG GGCAAGTGTC CAGTTACACA TAGGAAAGAT 10740 
AATTTAGAGA TATATTAATC ATATATAAAG GAAAATTAAA AACAGAGTAG TTGATGATGA 10800 
GCCTGGAGTA GAAGGCATAT CCCAGAACAG GAGGAGCCTT GTAAACCACA TAGGAACTTC 10860 
CTATTTTATG CTAAAGGGAT AAGAAACTCA TTACAGGCTT TGATGGTTGT TTGTCAAAGA 10920 
GGGGCATAAA ATTATCATAT CCACATCTAG AAAATACATC TCTOGCTAOG CT6ATATCAA 10980 
TGGATGOGAG GAAAGAACAG TGTGGTTACC ATATATAAAT TAGGAAATCA TTAGAGTATT 11040 
GGGAGTGGAA ATGGAGAGAA AGAAAGAGCC TGGGGGAATT ATTTAGGAAA TAATAGTTAC 11100 
AGAAAGACAT CTAAGTTGCT GACCTATCTG ACTGGATGGA TGGAAGAATA TCTTGTTTCT 11160 
GAGAGAAAAA AAGACTTTGG GrXTAAATTT GTACTTGATG AATTAAGGTA CTTTTAATAT 11220 
TCAAATG6AT TTOCCTGGCA GGCACTTGAA GATATTAGTC TAAATCTCAG AAACA6AATA 11280 
TGATCTGAAG CTCTAAATTT GTGATATTCA ATATAAATAC TTTAGAGTCA TTGGGATAAA 11340 
TATGGTAGTT GTAGCTAAAA GCAAAAATAA GATACTAGGG AGAAAGSATA AAGTTAGAAG 11400 
AAAGAAGAAT CTAGAATTGA CCTTGAAGTA TATCAGCATG TGTAAAGATC AGGAATTGAT 11460 
CATTTTTATT TTCCA6AAAG TAGCTTTTCT TAGGGTTCCA TATTTACTCC CATAGATTCT 11520 
TCCC 

Seq ID NO: 465 Protein sequence 
Protein Accession #: BAB21525.1 

1 11 21 31 41 51 

| | | | | | 

MUSLSEAMTK FMFDLFQQFR KSKENNIFYS PISITSALGM VIiLGAKDNTA QQISKVLHFD 60 
Q V T E MTT EKA ATYHVDRSGN VHBQFQKLLT EFNKSTOAYE IJCIANKLPGE KTYQFLQHYL 120 
DAIKKPyQTS VESTDPANAP EESRKKINSW VESQTNEKIK NL7PDGTIGN 0TTLVLV24AI 180 
YFKGQWEMKF KKENTKBEKP WPMKNTYKSV QMMROYNSFtf FALLEDVQAK VLEIPYRGKD 240 




LSKXVLLFNE tUGUQKLEEK LTAEKLKEWT SLQNMRETCV DI£L?RFEME ESYDLKDTLR 
TKC2WNIFNG DAOLSGMTHS HdiSVSKVLH KAPVEVTHEG VEAAAATAW WELSSPSTN 
EEFCaiHPFL FPIRQKKINS XLFYGRFSSP 

Seq ID NO: 466 DNA sequence 

Nucleic Acid Accession #: NM_001910.1 

Coding sequence: 50..1240 
360 




41 
I 

GGACTCAGAA 
CAAGGATCCC 
GCACGGAGCC 
GAGTCCTGCT 
TACTTCGGCA 
GGCTCCTCCA 
AGCAGGTTCC 
CAGTATGGAA 
CTAACOGTGG 
GATGCAGAGT 
ACTCCAGTAT 
TACATGAGCA 
CACrCCCATT 
ATTGCACTGG 
GCGATTGTGG 
CAAAACGCCA 
AACGTCATGC 
GCCTACA'CCC 
CTTGACATCC 
TTTTACTCAG 
GGAGGGGCCT 
ATTCTTTACA 
TTGAATTAAG 
ACACTTCACA 
TCATATTTTG 
ATCTCCAAAC 
CAGCCCTGAC 
TCTCTCCAGC 
ATTTTGTCCA 
ATTGTCCCAC 
AGGCCCATAT 
GCACATTTGA 
TATCTTTCCT 
CTAGGTGGTG 
CTCTATTGGT 
TACAAAGTTC 



51 
I 

TGAAAACGCT 
TTCACAGGGT 
AGCTCTCTGA 
CAATGGACCA 
CTATCTCCAT 
ACCTCTGGGT 
ACCCTTCCCA 
CCGGGAGCTT 
TTGGCCAGCA 
TTGATGGAAT 
TTGAC3^CAT 
GTAACCCAGA 
TCTCTGGGAG 
ATAACATCXZA 
ACACAGGGAC 
TTGGG6CAGC 
CGGATGTCAC 
TACTGGACTT 
ACCCTCCAGC 
TCTTT6AC0G 
TGTGTCTGTG 
CCTACAAAAA 
ACCAAACAGA 
CATACACACC 
TATTGATTTT 
ATATGCACAA 
AACCCATCCA 
TCCACATGCT 
TAAATATTTC 
AAATGTTTGG 
ATTGCATTTA 
ACGTTGCTGG 
ATAAAATGGT 
CTGGGTACTT 
AATXTTTAAGA 
AGCATTTT 



Seq ID NO: 467 Protein sequence 
Protein Accession #: NP_001901.1 



KKTLLLLLLV 
SMDQSAKEPL 
QPSQSSTYSQ 
FDGILGLGYP 
FSGSLNWVPV 
IGAAPVDGEY 
HPPAGPLWII. 



11 
I 

LLBLGEAQGS 
INYLDMEYFG 
PGQSFSZQYG 
SX(AVGGVTPV 
TKOAYWQIAL 
AVECANIiNVM 
GDVFIRQFYS 



21 
I 

liHHVPLRRHP 
TISIGSPPQN 
TGSLSGIIGA 
FOMHKAQNLV 
DNIQVGGTVM 
PDVTFTINGV 
VFDRGNNRVG 



31 

I 

SIiKXKLSARS 
FTVIFDTGSS 
DQVSVEGLTV 
DLFMFSVYHS 
FCSBGCQAIV 
PYTLSPTAYT 
LAPAVP 



41 

I 

QLSEFWKSUN 
NLWVPSVYCT 
VGQQFGESVT 
SNPEGGAGSE 
DTTGTSLITGP 
LLDFVDGMQF 



51 

1 

LDMIQPTESC 
SPACKTHSRF 
EPGQTFVDAE 
LIFGGYDHSH 
SDKIKQLQMA 
CSSGFQGLDZ 



Seq ID NO: 468 DNA sequence 

Nucleic Acid Accession #: NM_018058.1 

Coding sequence: 319..1575 



1 
i 

TAOGOGCTGC 
GAGGGCOGGG 
TACAGOGACA 
GTCAAOGTGG 
A6AAAGGGCT 
CCTGATCCCC 
CTCAGAGATG 
GTGGGCCCCA 
AACTTCCTTT 
GTGGACGACC 
AAAGTGGACA 
ACCCATGGGA 
GTCOGCAGGG 
AACATTGCCT 
G6AGAOCCCC 
ACA06GGGTG 
GGAGAGTCCA 
TGGCTGOGAG 
CTCTACACCA 
tGTGAGATGG 
GIGA06TGGC 




CTGGAOGCTA 
TCATTGAAAT 
TGGCTGCTGA 
TCCTCAGCAG 
TCCACAACCG 
CCCACCAGCA 
TOGTCTATGG 
AGGTCCGCTT 
TCATCACCGC 
ACaSCAGCTC 
TCATOGAGGA 
T6GTGACOGA 
TGGCTCAGCC 
TGGTGCCACG 
AGAAGAGTGG 
AGCCCGTGGC 
CAGATGGCAA 



21 
1 

GGGGAACGCC 
CTTOCTCAAC 
GTTCOGCAAT 
GGCCAGCCTC 
CTCTATCTAC 
GGACCCTGAG 
GGCTGGGGTC 
CAGTGCCTCX; 
GGGCGATGGC 
TGGG06AGGT 
CAACTGGAAT 
CCGGGACATC 
CGACTTTGAC 
CTCAGCCAAC 
GCTCAATCCC 
CTTOGAOGGA 
GCTGTCOGTC 
CACCCGGGTT 
GGCCCACCTG 
ACACTTTGGC 
GAT0GT6AGC 



31 
i 

ATGGGGGTCA 
AOCAATAATG 
AAGOGGTGGG 
TTTGCOGGAC 
ATTGCCAATT 
GCCAGTGACC 
AGCAAATATA 
GATATCTTCr 
ACCTTTGTGG 
GTCGCCCTGG 
GGCCCCCACC 
GCCTCACCCA 
AATGACCAGG 
OGCCTCTTCC 
QGOGACGGCT 
GAOGGGATGC 
TTCOGGGGCA 
GGGGCCTTTG 
AGGATCATCG 
CTGGGGAA(% 
OSGAACGTGG 



41 
I 

CAGCCTGCGA 
CCTTCTCGGG 
AAGACATCCT 
GCTCTGTGGC 
ACGCCTACGG 
TCTCCCGGGG 
CAGGGGGCCG 
GCGACAATGA 
AOGCTGCGGC 
CTGACTTCAA 
GCCTCTATCT 
AGTTCTCCAT 
AGCTGGAGAT 
GCGTCATCCG 
TGGAGCCTGA 
TGGACCTCAT 
ATCAGGGCTT 
CCAGGGGAGC 
AOGGGGGCTC 
ATGAAGCCAG 
CCACOGGGGA 




CTTCTTCAAC 
TAGAGAGCAC 
GGGCC GGGGC 
CTTGTCOCAT 
CAACAACAAC 
TAAGGTOGTG 
AGGCTACCTG 
CA6TGTGGAG 
GATGAACTCA 
GTGCTGGACA TOCTCTACOC 
ACACCAATGA ATGCATCCAG 
ACAGCTATGG AAGCTACAGG 
AOGAGGATGG CACAf3CCTGC 
CCCCCACOGC TGCTGCTGCC 
CACOGCTCCT CGTAGATGGA 
OCAGCTGCTG AGCAGGGGTG 
AAGTCSGGCTT GTGCTGCTGC 
CCCAAGCCCA TCCATGCACA 
CTGTGCTGGG CACATAGCTG 
ATTCCAGTGG GTCTAATGAC 
CTGCACAGGA AGTATGAGGA 
AAAGCTATGT CACCTTACAC 
AAATGGGGAT TAAGAATAGA 
GACACTTQGC ACAAAACCTG 
GGGCTTTGTC AACAOGTG 



Seq ID NO: 469 Protein sequence 
Protein Accession #: NP_060528.1 
OOSGGATGAC 
TTCCCATTOG 
TGCOGGACCA 
GTGGGGACTC 
ACTGCCX3CTG 
GATCTCAATC 
GGACATGAAC 
CTAGACAGTA 
TTACTTAGCT 
TGATCACAGC 
CATATCTTAG 
CTTTAGTGTC 
CAGTCACTTA 
ATCTTGGGGT 
GCACATAGTA 



GACACACTTC 
TGTGCCCTOG 
ACAAGAAGTG 
TGGGCCAGTC 
CTGCT60CGC 
TGGGGTOGGT 
CAGOGGAtGG 
GGQATGTAAA 
AACAATTAGG 
AGACAGGGTC 
GACACAGATG 
CTGAGTTCAA 
ACTTGTTAGC 
TACTGTGGAG 
AAGGCTCAAT 



AOGACOCAGC 
AGACAAGCCX: 
CAGTCGGGGC 
AOCGGGOCCC 
TGCTGGAGCT 
GGTTAAGGAG 
AGTCCAGCAG 
GGCCTGGGAG 
GAGACTOGTA 
GCTGCGCTGA 
TGGCCAGGGA 
ATCCTQATTC 
CATCCATTAT 
ATTAGATTAA 
AAAAACAAGT 



KDPEASDLSR 
RGDGTFVDAA 
FRDIASPKFS 
BLNPGDALEP 
RTRVGAFAKG 
K»4VSRNVASG 
GAGPTRSAVG 



11 
I 

GIIiALRDVAA 
ASAGVDOPBQ 
MPSPVRTVIT 
BGRGTGGWT 
AKWLYTKKS 
^INSVIiEZLy 
ATSPTRMAQP 



21 
I 

EAGVSKYTGG 
HGRGVALADP 
ADFDNDQELE 
DFDGDGMUDL 
GAHLRIIDGG 
PRDEDTLQDP 
AHGLSASHRA 



31 
I 

RGVSVGPILS 

NSixacvDivy 

IFFSNIAYRS 
ILSHGBSMAQ 
SGYLCEMEPV 
APLETPMNAS 
PAPPPPPIiLL 



41 

I 

SSASDIFCDN 



SSAKRLFKVI 
PLSVFRGNQG 
AHFGLGKDEA 
SSKSCALETS 
PLPbXiLPLLE 



CCCACTGQAG 
GTATGTGTCA 
TACGAGCCCA 
CGCCCCACCA 
GCCACTGCTG 
AGCTGOGAGC 
GOGAGTGGGA 
CTAGACCCTC 
AGGCCRGGCC 
TGGOGCTTAC 
GGT G GTOTCA 
AGGAACTCAC 
CGCATCTGCA 
ATGTATGTAA 
GCCrCTCACT 



51 
I 

ENGFHFLFHN 
LQMSTHGKVR 
RREH(a)PIiIE 
FNNNWLRWP 
SSVEVTWPDG 
PYVSTPMEAT 
hPhUBRSS 



Seq ID NO: 470 DNA sequence 
Nucleic Acid Accession #: AJ279016 
Coding sequence: 1..1962 



1 
I 

ATGTCCAGGA 
CAGCX3GGCTG 
AGTAATCCCA 
TTTGAGATCG 
CAGAAGCGGC 
GACCGGCAGG 
GAGATCTACT 
TTGTTCAAGT 
CGTGGTGTGG 
GGACGCTACT 
ATTGAAATGG 
GCTGCTGAGG 
CTCAGCAGCA 
CACAACOGGG 
CACCAGCATG 
GTCTATGGCA 
GTCOGCTTCX: 
ATCACCX3C06 
CGCAGCrCCT 
ATCGAGGAGC 
GTGACOGACT 
GCTCAGCOGC 
GTGCCAOGCA 
AAGAGTGGGG 
CCOGTGGCAC 
GATGGCAAGA 
CTCTAOCCCC 
TTCTCCCAGC 
GTGTGOCCTC 
AACAAGAAGT 
CTCGGCCAGT 
GCTGCTGCC6 



CCAGCGGATG 
AGGGATGTAA 
TAACAATTAG 
CAGACAGGGT 
GGACACAGAT 
CCTGAGTTCA 
AACTTGTTAG 
TTAGTGTGGA 
AAAGGCTCAA 



11 
1 

TGTTACOGTT 
AACCCATGTT 
OXAGCTCAA 
TCGTGGCGGG 
TGGTGAACAT 
GGAACGCCAT 
TCCTCAACAC 
TCCGCAATAA 
CCAGCCTCTT 
CTATCTACAT 
ACCCTGAGGC 
CTGGGGTCAG 
GTGCCTCGGA 
GCGATGGCAC 
GGCGAGGTGT 
ACTGGAATGG 
GGGACATCGC 
ACTTTGACAA 
CAGCCAACCG 
TCAATCCCGG 
TCGACGGAGA 
TGTCCGTCTT 
CCCGGTTTGG 
CCCACCTGAG 
ACTTTGGCCT 
TGGTGAGCCG 
GGGATGAGGA 
AGGAAAATGG 
GAGACAAGCC 
GCAGTCGGGG 
CACCGGGCCC 
CTGCTGGAGC 
TGGTTAAGGA 
GACTCCAGCA 
AGGCCTGGGA 
GGAGACTCGT 
CGCTGCCCTG 
GTGCCCAGGG 
AATCCTGATT 
CCA7CCATTA 
GATTAGATTA 
TAAAAACAAG 




CCCCCACCGC 
CTCACCCAAG 
TGACCAGGAG 
CCTCTTCCGC 
CGA06CCTTG 
CX3GGATGCTG 
CCGGGGCAAT 
GGCCTTTGCC 
GATCATCGAC 
GGGGAAGGAT 
6AACGT6GCC 
CACACTTCAG 
CCATTGCATG 
CGTATGTGTC 
CTACGAGCCC 
COGCCCCACC 
TGCCACTGCT 
GAGCTGCGA6 
GGG6AGTGGG 
GCTAGACCCT 
AAGGCCAGGC 
ATGGCGCTTA 
AGGTGGTGTC 
CAGGAACTCA 
TOGCATCTGC 
AATGTATGTA 
TGCCTCTCAC 



CTCTGGTTTC 
ACCAACTCAG 
GCAGTTACTG 
CCCAACCTGG 
GAGCGCAGCT 
GCCTGCGACA 
TTCTCGGGGG 
GACATCCTGA 
TCTGTGGCCT 
GCCTACCGTA 
TCCCGGGGCA 
GGGGGCCGAG 
GACAATGAGA 
GCTGC6GCCA 
6ACTTCAACC 
CTCTATCTGC 
TTCTCCATGC 
CTGGAGATCT 
GTCATCOGTA 
GAGCCTGAGG 
GACCTCATCT 
CAGGGCTTCA 
AGGGGAGCTA 
GGGGGCTCAG 
GAAGCCAGCA 
AGCGGGGAGA 
GACCCAGCCC 
GACACCAATG 
AACACCTATG 
AACGAGGAT6 
ACCCCCACCG 
GCAC06GTCC 
CCCAGCT6CT 
AAA6TGGGCT 
CCCCAAGCCC 
CCTGTGCTGG 
CATTCCAGTG 
ACTGCACAGG 
CAAAGCTATG 
AAAATGGGGA 
AGACACTTGG 
TGGGCTTTGT 



41 
I 

TGCCCATCAC 
TTCTGCCTCC 

ATGTGGACCA 
TTCTGAAGTA 
CACCCTACTA 
TOQAOSGGGA 
TGGCCAOGTA 
GGGATGAGGT 
GTGTGGACAG 
ATGTGGGCX:C 
TTCTGGOGCT 
GCGTCAGCGT 
ATGGGCCTAA 
GTGCTGGTGT 
GTGATGGCAA 
AAATGAGCAC 

ccTcccxrrGT 

TCTTCAACAA 
GAGAGCACX3G 
GCGGGGGCAC 
TGTCCCATGG 
ACAACAACTG 
AGGTCGTGCT 
GCTACCTGTG 
GTGTGGAGGT 
TGAACTCAGT 
CACTGGAGTG 
AATGCATCCA 
GAAGCTACAG 
GCACAGCCTG 
CTGCTGCTGC 
TOGTAGATGG 
GAGCAGGGGT 
TGTGCTGCTG 
ATCCATGCAC 
GCACATAGCT 
GGTCTAATGA 
AAGTATGAGG 
TGACCTTACA 
TTAAGAATAG 
CACAAAACCT 
CAACAOG 



Seq ID NO: 471 Protein sequence 
Protein Accession #: CAC08451 
51 
I 

TQAGGGGTCC 
TGACTATGAC 
TGATGGGGAC 
TGACCGGGCC 
CGCGCTGCGG 
CGGCCGGGAG 
CACCGACAAG 
CAACGTGGCC 
AAAGGGCTCr 
TGATGCCCTC 
CAGAGATGTG 
GGGCCCCATC 
CTTCCTTTTC 
GGAGGACCGC 
AGTGGACATC 
CCATGGGAAG 
CCGCACGGTC 
CATTGCCTAC 
AGACCCCCTC 
AGGGGGTGTG 
AGAGTCCATG 
GCTGQGAGTG 
CTACACCAAG 
TGAGATGGAG 
GACGTGGCCA 
GCTGGAGATC 
TGGCCAAGGA 
GTTCCCATTC 
GTGCCGGACC 
CXrrGGGGACT 
CACTGCCX3CT 
AGATCTCAAT 
GGGACATGAA 
CCTAGACAGT 
ATTACTTAGC 
GTGATCACAG 
CCATATCTTA 
ACTTTAGTGT 
CCAGTCACTT 
AATCTTGGGG 
GGCACATAGT 



60 
120 
180 
300 
540 
600 
1 11 21 31 41 51 

| | | | | | 

MSRMLPFLLL LWFLPITEGS QRAEPMPTAV TNSVLPPDYD SNPTQLMYGV AVTOVDBDGD 
FEIWAGYNG PNLVLFODRA QiCRLVMIAVD ERSSPYYALR DBQGSNAIGVT ACDIDGDGRB 
EIYFLNTKNA FSGVATYTDK LPKPRHNRWB DILSDEVNVA 8GVASLPAGR SVACVDRKGS 
GSYSIYIANY AXSnVCPDAL laOPEASDL 
LSSSASDIFC DMBiSTIIPLP HNRGDGTPVD 
VYOnOIGPHR LYUIHSTBGK V8FBDIASPK 
RSSSAMRbPa VISREHCSFb IGELMPGOMi 
AQPLSVFaGN QGFtnlNWIJlV VPRTRFGAFA 
PVAHFGLGKD EASSVEVTWP tX5KMVSHSVA 
PSQQSNGHCM DTKECIQFPF VCKIDKPVCV 
U3QSCGPRPT TPTAAAATAA MAAM3AATA 



Seq ID NO: 472 DNA sequence 
Nucleic Acid Accession #: 
Coding sequence: 1..4794 






SRGIIALRDV 
AAASAGVDDP 
FSKPSPVRTV 
EPBGRGTGGV 
RGAKWLYTK 
SGBMHSVLBI 
NTYGSYRCRT 
APVLVXXSDLE7 



AAEAGVSXYT 
HQKGRGVALA 
ITADFD!<DQS 
VTDPDGDGXL 
KSGAHLRXIO 

NKKCSRGYEP 
X/sSWKESCB 



GCRGVSVGPI 
DFNBDGKVDI 
LEIPFNNIAY 
DULSHGBSM 
GGSGYLCEME 
DPKBhECCQQ 
NSXX3TACVGT 
PSC 



240 
300 
360 
420 
480 
540 
600 



1 
I 

ATGGCGTGTC 
AGCGGCTCCT 
GTTCTGAAGT 
TCACCCTACT 
ATCGACGGGG 
CACAGCAGCT 
CCACCTACAA 
TCCTCCCTGG 
TGTCGGGGTG 
GGGGTGGCCA 
CTGAGCGATG 
GCCTGTGTGG 
GGTAATGTGG 
GGCATTCTGG 
TTCTCCCACA 
GGAGGAGACC 
TGCOGGCTGG 
CAGAGGGAGG 
TCCAAAAGCC 
GOGOCTTCTC 
CCCCTTGTCA 
CCCCACCCCC 
CTGATGGCTG 
CTGAGAA6CT 
GAGCTGGGACi 
CTGGGAGAAC 
CCCAAGGTCA 
GGCCCOGGGA 
CTCTCCCATC 
GTGCCGGGAG 
CTGGCGTGGA 
TTTAGGCTCA 
CTGCAGTTCC 
TCTGCCACTC 
ATCCTCAGCA 
TTCCACAACC 
GCCTTCATCG 
CTAGCAGAAA 
CCACATTGCC 
TTCTTGACGC 
CAGGGGGCCC 
ACTGCCTATT 
TTGTCCTCTG 
GCCCTGGCTG 
CCCCAC06CC 
TCACCCAAGT 
GACCAGGAGC 
CTCTTCCGAT 
GGTCAGGGAG 
AAGGTCAACA 
AGAGGCTGTG 
AAAGGGAAGG 
CCACACTACC 

gtccaatcac 
<:ggggtccaa 
gctacgggct 
aggggctacg 
agaaaggggc 
ccaggaaaag 
actaccagga 

A^iTCACTACC 

gtccaatcac 
cggggtccaa 
gctatggggt 
aggggctatg 
agaga6cacg 
ggccggggca 
ttgtcccatg 
aacaacaact 

AAGGTCGTGC 
GGCTACCTGT 
AGTGIGGAGG 



11 
I 

OGGGAGGACT 
CCCCAGCATC 
ATGACOGGGC 
AOGCGCTGCG 
ACGGCCGGGA 
CAGOGCAGGT 
CCCCT6CAGG 
GTCAGGCTTC 
GACTGAGACC 
CGTACACCGA 
AGGTCAAOGT 
ACAGAAAGGG 
GCCCTGATGC 
CGCTCAGAGA 
CTGCCTCTCC 
CAGAGGAGGC 
GCTGGAAGGA 
CTGGGGCAGC 
ATTTGGCTGA 
CAGCCCACCC 
CTCAGCTAAT 
GAGCCCCAGG 
AGGCTTTGGG 
GGGAGGAAAG 
GTCCCTGGAG 
CTCCCATTTT 
CACAGGAGTG 
GGGTGGCCAA 
CCCTGGTCCC 
CTGCCCTGCC 
ACCAGATGGA 
GGAAAGCAGG 
CCTCAGGCCT 
ACTGTGGGTC 
GCAGTGCCTC 
GCGGCGATGG 
TTCACCTCAA 
CTGGTCCTTC 
ATCATGGTTT 
AAGGCTTGGC 
CACCCTGCCT 
ACATTGTCCT 
AAAGAGTCAA 
ACTTCAACCG 
TCTATCTGCA 
TCTCCATGCC 
TGGAGATCTT 
GCTCCATCCT 
AAGGTTTAAG 
CAGGTCCCCT 
GGAATGCAGG 
GAAATGTGGC 
AGAAAAAGOG 
TACCAGGAAA 
TCACTACCAG 
CCAATCACTA 
GGCTCCAATC 
TAOGGGCTCC 
GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGGG 
TACCACA6AA 
TCACTACCAG 
CCAATCACTA 
GGGTCCAATC 
GAGACCCCCT 
CAGGGGGTGT 
GAGAGTCCAT 
GGCTGCGAGT 
TCTACACCAA 
GTGAGATGGA 
TGAOGTGGCC 



21 
I 

CCCAGCCOGT 
CCCTCCCCAT 
CCAGAAGCGG 
GGACCGGCAG 
GGAGATCTAC 
CCCTTCTGGG 
CCTCCTGGGT 
TCCGGACAGC 
TACCCATGAA 
CAAGTTGTTC 
GGCCCGTGGT 
CTCTGGACGC 
CCTCATTGAA 
TGTGGCTGCT 
AAGCATTGGT 
AGATGAGGAG 
CGGGCAGTTC 
TGGCGTGCCC 
C3UVGAACCTA 
TTTCCCTGCC 
GACACATGGA 
AATGGACCCC 
CGCGTGGCCA 
CAGGCAGAAG 
CCAAGCCACA 
ACAAAGAACA 
CCATCTAGTG 
GC6AGAGATT 
CAACTTCCCC 
TGGGAATCCT 
AAAAGAGGAG 
GGAAGCAGAA 
CAGAGGCAGC 
GATGTCTTTT 
GGATATCTTC 
CACCTTTGTG 
ATATCACCTC 
CTCCTCCTGC 
GTCTATGAGC 
CTCCAGTGCC 
TCTGGCAAGA 



31 



41 



I 



CGTGGGTGTG 
TGATGGCAAA 
AATGAGCACC 
CTCCCCTGTC 
CTTCAACAAC 
GGCTOGTGGC 
AATCAGAAGG 
GATGAAGAAA 
GCAAAGCCTG 
CCAAAGTGTG 
GCTACAG66T 
AGGGGCTACG 
GAAAAGGGGC 
CCAGGAAAAG 
ACTACCAGGA 
AATCACTACC 
GTCCAATCAC 
GGG6CTCCAA 
GCTACAGGGT 
AGGGGCTACG 
GAAAAGGGGC 
CCAGGAAAAG 
ACTACCACAG 
CATCGAGGAG 
GGTGACCGAC 
GGCTCAGCCG 
GGTGCCACGC 
GAAGAGTGGG 
G000GTG6CA 
AGATGGCAAG 



TGCTCTGGTT 
TCCTCCTCCA 
CTGGTGAACA 
GGGAACGCCA 
TTCCTCAACA 
CTCCACAGAA 
CTGCCTCCAC 
AGGCAGGGAG 
CCAGAACCAT 
AAGTTGGGCA 
GTGGCCAGCC 
TACTCTATCT 
ATGGAGCCTG 
GAGGCIGGGG 
GAGATATCTG 
CACAGTGGGG 
AAGGAAGAAG 
AGAGGAOGTG 
TTTCisCCCAC 
CGCCAAGCCC 
CCTCTGGCTG 
AAATOTAAGG 
GCGCTCAGCA 
GGGCAGGCCA 
CAGCACCTGC 
GACGGAGATC 
GCCACCATGC 
GGGAGAGAGA 
AGCTGCTTGA 
GGGAACTGGG 
GGGAAGATTC 
TTCCCCCCAG 
CCTGTCCTCC 
CTAGGGGGCC 
TGGGACAATG 
GAGGCIGGGG 
TGCAGAGATT 
TGCCOGTGGC 
TTTACAAGGA 
CACCGGAGGA 
GCTCCCTGTG 
ATCCCAGAGA 
GAOGACCCCC 
GTGGACATOG 
CATGGGAAGG 
CGCACGGTCA 
ATTGCCTACC 
TCTTCATOCT 
GGAGGGTTCC 
CAGAAAGGAA 
GCCAAGGAGC 
CCCAGAACCC 
CCAATCACTA 
GGGTCCAATC 
TACGGGGTCC 
GGGCTACAGG 
AAAGGGGCTA 
AGGAAAAGGG 
TACCAGGAAA 
TCACTACCAG 
CCAATCACTA 
GGCTCCAATC 
TACGGGCTCC 
GGGCTAOGGG 
AAAGGGGCTA 
CTCAATCCOG 
TTOGACOGAG 
CTGTCCGTCT 
ACCOGGTTTG 
GC CCACCT GA 
CACTTTGGCC 
ATtSGTGAGCC 



51 
I 



GGATGGGACT GGGTGGGCCC 
GGTACAATGG ACCCAACCTG 
TCGGGGTOGA TGAGCGCAGC 
TGGGGGTCAC AGCCTGCGAC 
CCAATAATGC CTTCTOGGGC 
ACAGGOCTGT GCTGAAGCCT 
TCAGCGGAAG GGACTTTTCC 
AGAGGGTGCC GGTTCCCTGC 
TTCTTCTGAG ACCCAAATCA 
ATAAC06GTG GGAAGACATC 
TCTTTGCOGG AOGCTCTGTG 
ACATTGCCAA TTACGCCTAC 
AGGCCAGTGA CCTCTCCCGG 
TCAGCAAATA TACAGAAGGC 
GCAGAACOGA GGAGOGGGAA 
ATGGAAGCAC CAGCCAACTG 
CAGCAGCTTT GGTGGAGGAA 
TTGGAACAGC TCTGCAGACT 
CATGTTACTA TTCTGTCTGC 
CCCAACACTA CCCTGTAGCC 
GAAAACTAGC CCGGAGTGTC 
GCOGCCATGC TGAGCOCGGC 
CCACTGTGGT GCCAGGGGGC 
TGTCCAGATG TGCACTCAGG 
CTGCTAGAGA GCTGTATGAC 
CAGGGAGGAG AAGGGACTCG 
CAGCTCTCGG GGGACTCGAG 
CIGGGGCAGT AGGAAGACCA 
GGCCTCTTGA AGCOGGGACA 
TTCTGGACAT GGCCAAGGCC 
ATGGAGACCA TGAGCCCAGA 
GCTCCTCTGA GGAGCCTCTG 
AGGTGGGCCT GGGGCTTGCT 
GAGGCGTCAG CGTGGGCCCC 
AGAATGGGOC TAACTTCCTT 
CCAGTGCTGA ACGTCGTTTA 
TTCCTCACTC CCTGTGCCAC 
ATGCACGTCT TCTTCAGGCT 
COGGGTCACG GTTCTATTCA 
CRCTCAGCCT CCAGGGTTCT 
TCCTGGGGTC TCTGATCCCC 
GCCTGATGAC CCACAGCTAT 
ACCAGCATGG GC6AG6TGTC 
TCTATGGCAA CTGGAATGGC 
TCCGCTTCCG GGACATCGCC 
TCACCGCCGA CTTTGACAAT 
GCAGCTCCTC AGCCAACCGC 
TGACftGCTGG T6G6AGGAAC 
CAGGGCCAG6 GGGTCAGGCC 
GGAAGGAOGA GGACTGGGCA 
CGGCCTCTGC TATTGCAGGG 
AAGOGCCACA AGATACAAAG 
CCAGGAAAAG GGGCTAGGGG 
ACTACCAGGA AAAGGGGCTA 
AATCACTACC AGGAAAAGGG 
GTCCAATCAC TACCAGGAAA 
CAGGGTCCAA TCACTACCAG 
GCTACGGGGT CCAATCACTA 
AGGGGCTACG GGGTCCAATC 
GAAAAGGGGC TACGGGGTCC 
CCAGGAAAAG GGGCTACAGG 
ACTACCAGGA AAAGGGGCTA 
AATCACTACC AGGAAAAGAG 
GTCCAATCAC TACCAGGAAA 
CGGGGTCCAA CGTCATCCGT 
GCGACGCCTT GGAGCCTGAG 
ACGGGATGCT GGACCTCATC 
TCCGGGGCAA TCAGGGCTTC 
GGGOCTTTGC CAGGG6AGCT 
GGATCATC6A OGGGGGCTCA 
TGGGGAAGGA TGAAGCCAGC 
GGAAC6TGGC CAGGGGGGA6 



60 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1680 
1740 
1860 
1920 
1980 
2040 
2100 
2520 
2640 
2700 
2760 
2820 
2880 
2940 
3060 
3120 
4140 
4200 
AK^CT<^ TCC^GAGAT CCTCTACCCC CGGGATGACG ACACACTTCA GGACCCRGCC 4380 

CCACTOGAGT GTCGCCAAGG ATTCTCCCAG C3U3GAAAATG GCCATTGCAT GGACACCAAT 4440 

GAATGO^TCC AGTTCCCArr CGTGTGCCCT CGAGACAAGC CCGTATGTGT CAACACCTAT 4500 

GGAAGCTACA GGTGCCGGAC CAACAAGAAG TGCAGTCGGG GCTAOGAGCC CAAOGAGGAT 4560 

GGCACAGCCT GOGTGGGTAC TGAGCTAGGC TCTAGGCATA CAATGACXTTG GAAACCAAGG 4620 

CCCAAAAAGG AGCTGCAACT TTOCCAAjGGC ATCTGCRCCC COJfCTGGTC C WITIDLTG 4680 

CCGGGTroCC GGCIGCrCXrT CAAAAGAGCT CAGCTCCAGG CTGCTCCCRG CACOCTTCTC 4740 
CAGAAAGCTC CAGGTATTCC AGAAGCCCAA GTGTATGAAC AAGATCAGGA ATAA 

Seq ID NO: 473 Protein sequence 
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

ilACPGGLPAR CSGWKGLGGP SGSSPASPPH SSSRYNGPNL VLKYDRAQKR LWIAVDERS 60 

SPYYALROKO GNAIGVTACD XDGDGHEEIV FLNTNTIAFSG HSSSAQVPSG LHRNRPVLKP 120 

PPTTOAGI^ LPPLSGRDPS SSLGQASPDS RQGEKVPVPC CRGGLRPTHE PEPFUJIPKS 180 

GVATYTOKLF KFRNNRWEDI LSDEVNVARG VASI^FAGRSV ACVDRKGSGR YSIYIAWYAY 240 

GNVGPDALIE MDPEASDLSR GILALRDVAA BACVSKTrBG PSHTASPSIG EISG^BM 300 

GGDPEEADEE HSGDGSTSQL CRLGWKDGQP KEEAAALVEE QREAGAAGVP RGRVRTAIflT 360 

SKSHLADKNL FCPPCVYSVC APSPAHPFPA RQAPQHYPVA PLVTQLMTHG RLAGKLARSV 420 

PHPRAPGMDP KCKGRHAEPG LMAEAIiGAWP ALSTTWPGG LRSWEESRQK GQAMSRCALR 480 

ELGGPHSQAT CHLPARELYD LGEPPILQRT DGDPGRRRDS PKVTQECHLV ATKPALGGLE 540 

GPQIVaSeI GRETGAVGRP LSHPLVPNPP SCLBPLBAGT VPGAALPGNP GNWVLDMAKA 600 

LAWOffiKEE GKIHGDHEPR FRLRKAREAE PPPGSSEBPL LQPPSGUIGS PVLQVGLGLA 660 

SATHCGSMSP LGGHGVSVGP ILSSSASDIF CDNESGPNFL FHNRGDGTFV DAAASABRRL 720 

APIVHLKYHL CHDFPHSLC3I LAETGPSSSC CPWHARLLQA PHCHHGLSMS FTRTGSRP^ 780 

FLTOGIASSA HRRTLSLQGS QGAPPCLLAR APCVLGSLIP TAYYIVLWSA IPESLKTHSY 840 

' LSsSSvGV ODPHQHGRGV ALADPNRDGK VDIVYGNWNG PHRtYLQMST HGKVRFRDIA 900 

SPKFSMPSPV RTVITADFDM DQELEIFFNN lAYRSSSANR LFRCSILARG SSSLTAGGRN 960 

GQGEGLRIRR GGPPGPGGQA KVNTGPLMKK QKGRKDEDWA RGCGKAGQSL AKEPASAIAG 1020 

RGKCaiVAQSV PRTQAPQDTK PHYHKKGLQG PITTRKRGYG VQSLPGKGAT GafflYQ^ 1080 

RGPITTRKRG YGVQSLPGKG ATGSNHYQEK GLQGPITTRK RGYGLQSLPG KGW^NHYH 1140 

RKGLRAPITT RiCRGYGVQSL PGKGATGSNH YQEKGLRGPI TTRKRGYGLQ SbPGKGATGS 1200 

HHYQEKGLQG PITTRKRGYR VQSLPQKGAT GSNHYQBKGL RGPITTRKRG YGLQSLPGIffi 1260 

AMGSNHYQEK GLRAPITTRK RGYGVQSLPQ KGATGSNVIR RBHGDPLIEE LNPGDALEPE 1320 

GRGTGGWTD FDOXaiLDLI IiSHGESMAQP hSVFBGSQG? NNNWLRWPR TRFGAFARGA 1380 

KWLYTKKSG AHLRIIDGGS GYLCEMEPVA HFGLGKDEAS SVBVTWPDGK MVSRNVASGE 1440 

fflNSVLEILYP RDEDTLQDPA PLECGQGFSQ QENGHCHDTN ECIQFPFVCP RDKPVCVNTY 1500 

GSYRCRTKKK CSRGYEPNBD GTACVGTELG SRHTMTWKPR PKKBLQLSQG XCTPVWSFFL 1560 
PGCRLLLKRA QLQAAPSTLL QKAPGIPEAQ VYEQDQE 

Seq ID NO: 474 DNA sequence 

Nucleic Acid Accession #: NM_003661.1 

Coding sequence: 1..1152 

1 11 21 31 41 51 

| | | | | | 

ATGACTGCAC TTTTCCTTGG TGTGGGAGTG AGGGCAGAGG AAGCTGGAGC GAGGGTGCAA 60 
CAAAACGTTC CAAGTGGGAC AGATACTGGA GATCCTCAAA GTAAGCOCCT CGGTGACTGG 120 
GCTGCTCGCA CCATGGACCC AGAGAGCAGT ATCTTTATTG AGGATGCCAT TAAGTATTTC 180 

AAGGAAAAAG TGAGCACACA GAATCTGCTA CTCCTGCTGA CTGATAATGA GGCCTGGAAC 240 

GGATTCXrrGG CTGCTGCTGA ACTGCCCAGG AATGAGGCAG AT6AGCTCCG TAAAGCTCTG 300 

GACAACCTTG CAAGACAAAT GATCATGAAA GACAAAAACT GGCACGATAA AGGCCAGCAG 360 

TACAGAAACT GGTTTCTGAA AGACTTTCCT CGGTTGAAAA GTGAGCTTGA GGATAACATA 420 

AGAAGGCTCC GTGCCCTTGC AGATGGGGTT CAGAAGGTCC ACAAAGGCAC CACCATCGCC 480 

AATGIGGTGT CTGGCTCTCT CAGCATTTCC TCTGGCATCC TGACCCTCGT CGG CATGG GT 540 

CTGGCACCCT TCACACAGGG AGGCAGCCTT GTACTCTTGG AACCTGGGAT GGAGTTGGGA 600 

ATCACAGCCG CTTTGACCGG GATTACCAGC AGTACCATGG ACTAC3GGAAA GAAGTGGTGG 660 

ACACAAGCCC AAGCCCACGA CCTGGTCATC AAAAGCCTTG ACAAATTGAA GGAGGTGAGG 720 

GAGTTTTTCG GTGAGAACAT ATCCAACTTT CTTTCCTTAG CTGGCAATAC TTACCAACTC 780 

ACAOGAGGCA TTGGGAAGGA CATCCGTGCC CTCAGACGAG CCAGAGCCAA TCTTCAGTCA 840 

GTACOSCATG CCTCAGCCTC AOGCCCCCGG GTCACTGAGC CAATCTCAGC TGAAAGCGGT 900 

GAACAGGTGG AGAGGGTTAA tXSAACCCAGC ATCCTG6AAA TGAGCAGAGG AGTCA AGCTC 960 

AOGGATGTGG CCCCTGTAAG CTTCTTTCTT GTGCTGGATG TAGTCTACCT CGTGTACGAA 1020 

TCAAAGCACT TACATGAGGG GGCAAAGTCA GAGACAGCTG AGGAGCTGAA GAAGGTGGCT 1080 

CAGGAGCTGG AGGAGAAGCT AAACATTCTC AACAATAATT ATAAGATTCT GCAGGOQGAC 1140 
CAAGAACTGT GA 

Seq ID NO: 475 Protein sequence 
Protein Accession #: NP_003652.1 

1 11 21 31 41 51 

MSALFLGVGV RAEEAGARVQ QNVPSGTDTG DPQSKPLGDW AAGTMDPESS IFIEDAIKYF 60 

KBKVSTQMUi LLLTDNEAWN OFVAAAELPR HEADBLRXAI. DNLARQMIMK OKNWHDKGQQ 120 

YRNWFLKEFP RLKSELEDNI RRLRALADGV QKVHKGTTIA NWSGSLSIS SGILTLV04G 180 

IAPFTEGGSL VLLEPGMELG ITAALTGITS STMDYGKKWW TQAQAHDLVI KSLDKLKEVR 240 

EFLGENISNF LSLAGNTYQL TRGIGKDIRA LRRARANLQS VPHASASRPR VTEPISAESG 300 

EOVERVNBPS Il>EMSRGVKL TDVAPVSFFL VLDWYLVYE SKHLHEGAKS BTAEELKKVA 360 
QELECKUfIL MNNYKILQAD QEL 

Seq ID NO: 476 DNA sequence 

Nucleic Acid Accession #: NM_014452.1 

Coding sequence: 1..1968 

1 11 21 31 41 51 




ATCC3QC5ACCr CTCCGAGCAG CAGCACCGCC CTOGCCTCCT GCAGCCGCA7 OGCCCGCCGA 60 

GCCACAGCCA CGATGATOGC GmiC fCCClT CTCCTGCTTG GATTTCCTTAG CACCACCACA 120 

GCTCAGCCAG AACAGAAGGC CTOGAATCTC ATTCGCACAT ACCGCCATGT TGACCGTGOC 180 
ACCGGCCAGG TGCTAACCTO TGACAAGTGT CCAGCAGGAA CCTA TGTC TC TGAGGATTGT 240 

ACCAACACAA GCCTGCGCGT CTGCAGCAGT TGCCCTGTGG GGACCTTTAC CAGGCATGAG 300 
AAXGGCATAG AGAAATGCCA TGACTGTACT CAGCCATGCC CATGGCCAAT GATTGAGAAA 360 

TTACCTTCTG CIGCCTTGAC TGACCGAGAA TGCACTTGCC CACCTGGCAT GTTCCAGTCT 420 

AACGCTACCr GTGCCCCCCA TACGGTGTGT CCIGTCGGTT GGGGTGTGCG GAACAAAGGG 480 

ACAGAGACTG AGGATGTCOG GTGTAAGCAG TGTGCTOGGG GTACXTTCTC AGATGTGCCT 540 

TCTAGTGTGA TGAAATGCAA AGCATACACA GACtGlCTGA GTCAGAACCT GGTGGIGATC 600 

AAGCCGGGGA CCAAGGAGAC ACACAACGTC TGTGGCACAC TCCCGTCCTT CTCCRGCTCC 660 

ACCrCACCTT CCXCTGGCAC AGCCATCTTT CXaOGCCCTG AGCACATGGA AACCCATGAA 720 

GTCCCTTCCT CCACTTATGT TCCCAAAGGC ATGAACTCAA CAGAATCCAA CTCTTCTGCC 780 

TCTGTTAGAC CAAAGGTACT 6AGTAGCATC CAGGAAGGGA CAGTCCCTGA CAACACAAGC 840 

TCAGCAAGGG GGAAGGAACA OGTGAACAAG ACOCTCCCAA ACCTTCAGGT AGTCAACCAC 900 

CAGCAAGGCC CCCACCACAG ACACATCCTG AAGCTGCTGC OGTCCATGGA GCCCACTGGG 960 

GGCGAGAAGT CCAGCACGCC CATCAAGGGC CCCAAGAGGG GACATCCTAG ACAGAACCXA 1020 

CACAAGCATT TTGACATCAA TGAGCATTTG CCCTGGATGA TTGTGCTTTT OCTGCTGCTG 1080 

GtGCTTGTGG TCATTGTGGT GTGCAGTATC CGGAAAAGCT CGAGGACTCT GAAAAAfiGGG 1140 

C0CXX3GCAG0 ATCCCAGTGC CATTGTGGAA AAGGCAGGGC TGAAGAAATC CATGACTCCA 1200 

ACCCAGAACC GGGAGAAATG GATCTACTAC TGCAATGGCC ATGGTATCGA TATCCTGAAG 1260 

CTTGTAGCAG CCCAACTGGG AAGCCAGTGG AAAGATATCT ATCAGTTTCT TTGCAATGCC 1320 

AGTGAGAGGG AGGTTGCIGC TTPCTCCAAT GGGTACACAG COSACCAOGA GOGGGCCTAC 1380 

GCAGCTCTGC AGCACTGGAC CATCOGGGGC CCCGAGGCCA GCXTTOGCCCA GCTAATTAGC 1440 

GCCCTCCGCC AGCACCGGAG AAAOGATGTT GTGGAGAAGA TTCGTGGGCT GATGGAAGAC 1500 

ACCACCCAGC TGGAAACTGA CAAACTAGCT CTCCCGATGA GOCCCAGCCC GCTTAGCCCG 1560 

AGCCCCATCC CCAGCCCCAA CX3CX3AAACTT GAGAATTCOG CTCTCCrGAC GGTGGAGCCT 1620 

TOCOCACAGG ACAAGAACAA GGGCTTCTTC GTGGATGAGT OGGAGCCCCT TCTCCGCTGT 1680 

GACTCTACAT CCAGCGGCTC CTCCGCGCTQ AGCAGGAACX3 GTTCCTTTAT TACCAAAGAA 1740 

AAGAAGGACA CAGTGTTGCG GCAGGTACGC CTGGACCCCT GTGACTTGCA GCCTATCTTT 1800 

GATGACATGC TCX:ACTTTCT AAATCCTGAG GAGCTGCGGG TGATTGAAGA GATTCCCCAO 1860 

GCTGAQGACA AACTAGACCG GCTATTOGAA ATTATTGGAG TCAAGAGCCA GGAAGCCAGC 1920 
CAGACCCTCC TGGACTCTGT TTATAGCCAT CTTCCTGACX: TGCTGTAG 



Seq ID NO: 477 Protein sequence 
Protein Accession #: NP_055267.1 

1 11 21 31 41 51 

MGTSPSSSTA LscSRIARR ATATMIAGSL LLLGFLSTTT AQPEQKASNL IGTYRHVDRA 60 

TGQVLTCDKC PAGTYVSEHC TNTSLRVCSS CPVCTPTRHE NGIEKCHDCS QPCPWPMIEK 120 

LPCAALTDRE CTCPPGMFQS NATCAPHTVC PVGHGVRKKG TETEDVRCKQ CARGTFSDVP 180 

SSVMKCKAYT DCLSQNLWI KPGTKETDNV CGTLPSPSSS TSPSPGTAIP PHPEHMBTHE 240 

VPSSTYVPKG MNSTESNSSA SVRPKVIiSSI QEGTVPDKTS SARGKEDVNK TLPHLQWliH 300 

QQGPHHRHIL KLLPSMEATG GEKSSTPIKG PKRGHPRQNL HKHFDINEHL PWMIVLFLLL 360 

VLWIWCSI RKSSRTLKKG PRQDPSAIVE KAGLKKSMTP TQNREKWIYY CNGHGIDILK 420 

LVAAQVGSQW KDIYQFLCKA SEREVAAFSN GYTADHERAY AAIiQHWTIRG PEASLAQLIS 480 

ALRQHRRNDV VEKIRGLMED TTQLBTPKLA LPKSPSPLSP SPIPSPNAKL ENSALLTVEP 540 

SPQDKNKGFP VDESBPLLRC DSTSSGSSAL SRMGSPITKB KKDTVLROVR MPCDLQPIF 600 
DDMbHFLNPE EUlVIEEIPQ AEDKLDBLFE IIGVKSQEAS QTLLDSVYSH LPDLL 



Seq ID NO: 478 DNA sequence 
Nucleic Acid Accession #: XM_044533 
Coding sequence: 238..2751 

1 11 21 31 41 51 

GCTCTGCCCA AGCCGAGGCT GCGGGGCCX3G CGCCGGCGGG AGGACTGOGG TGCCCXGCGG 60 

AGGGGCTGAG TTTGCCAGGG CCCACTTCAC CCTGTTTCCC ACCTCCCGCC CCCCAGGTCC 120 

GGAGGCX3GGG GCCCCCGGGG CGACTCGGGG GCGGACCGCG GGGCGGAGCT GCCGCCCGTG 180 

AGTCCGGCOS AGCCACCTGA GCCCGAGCCXS CGGGACACOG TCGCTCCTGC TCTCOGAATG 240 

CTGCGCACCG CGATGGGCCT GAGGAGCTGG CTOGCOGOCC CATGGGGCGC GCTGCOGCXTT 300 

CGGCCACCGC TGCTGCTGCT CCTGCTGCTG CIGCTCCTGC TGCAGCOGCC GCCTCCGAOC 360 

TGGGCGCTCA GCCCCCGGAT CAGCCTGCCT CTGGGCrCTG AAGAGCGGCC ATTCCTCAGA 420 

TTCX5AAGCTG AACACATCTC CAACTACACA GCCCTTCTGC TGAGCAGGGA TGGCAGGACC 480 

CTGTACGTGG GTGCTCGAGA GGCCCTCTTT GCACTCAGTA GCAACCTCAG CTTCCTGCCA 540 

GGOQGGGAGT ACCAGGAOCT GCTTTGGGGT GCAGA06CAG AGAAGAAACA GCAGTGCAGC 600 

TTCAAGGGCA AGGACCCACA GCGCGACTGT CAAAACTACA TCAAGATCCT CCTGCCGCTC 660 

AGOSGCAGTC ACCTGTTCAC CTGTGGCACA GCAGCCTTCA GCCCCATGTG TACCTACATC 720 

AACATGGAGA ACTTCACCCT GGCAAGGGAC GAGAAGGGGA ATCTCXTTCCT GGAAGATGGC 780 

AAGGGCCGTT GTCCCTTCGA CCCGAATTTC AAGTCCACTG CCCTGGTGGT TGATGGCGAG 840 

CTCTACACTG GAACAGTCAG CAGCTTCCAA GGGAATGACC CGGCCATCTC GCX3GAGCCAA 900 

AGCCTTOGCC CCACCAAGAC CGAGAGCTCC CTCAACTGGC TGCAAGACCC AGCTTTTGTG 960 

GCCTCAGCCT ACATTCCTGA GAGCCTGGGC AGCTTGCAAG GCGATGATGA CAAGATCTAC 1020 

TTTTTCTTCA GCGAGACTGG CCAGGAATTT GAGTTCTTTG AGAACACCAT TOrGTCCCGC 1080 

ATTGCCCGCA TCTGCAAGGG CGATGAGGGT GGAGAGCGGG TGCTACAGCA GOGCTGGACC 1140 

TCCTTCCTCA AGGCCCAGCT GCTGTGCTCA CGGCCCGACG ATGGCTTCCC C TTCAA OGTG 1200 

CTGC3VGGATG TCTTCACGCT GAGCCCCAGC CCCCAGGACT GGCGTGACAC (XTTTTCTAT 1260 

GGGGTCTTCA CTTCCCAGTG GCACAGGGGA ACTACAGAAG GCTCTGCGGT CTGTGTCTTC 1320 

ACAAT6AAGQ ATGTGCAGAG AGTCTTCAGC GGCCTCTACA AGGAGGTGAA CX33TGAGACA 1380 

CAGCAGTGGT ACACOGTGAC CCACCCGGTG CCCACACCCC GQCCTGGAGC GTGCATCACC 1440 

AACAGTGCCC GGGAAAGGAA GATCAACTCA TCCCTGCAGC TCCCAGACCG OGTGCTGAAC 1500 

TTCCTCAAGG ACCACTTCCT GATGGACGGG CAGGTC06AA GCXXX3VTGCT GCTGCTGCAG 1560 

OCCCAGGCTC GCTACCAGOG OGTGGCTGTA CACCGCGTCC CTGGCCTGCA CCACACCTAC 1620 

GAtGTCCTCT TCCTGGGCRC TGGT6ACBGC CGGCTCCACA AGGCAGTGAG OGTGGGCCCC 1680 

CGGGIGCACA TCATTGAGGA GCTGCAGATC TTCTCATCGG GACAGCCOGT GCAGAATCTG 1740 

CrCCroCSRCA CCCACAGGGG CSCTGCTGTAT GOGGOCTCAC ACTCGGGOST AGTCCAGGTG 1800 

ScS^ ACTGCAGCCT CTACAGGAGC TOTGC3GGACT GCCTCCTOSC CCGGGACCCC 1B60 

SSS^ GGACOSGCrC CAGCTGCWUS 1^20 

^SSct ^tSSgA CATCGAGCSGA GCCAGO^ 1^80 

TCGGTTCTCT CXXCGTCTTT TCTACCAACA GGGGAGAAGC CATGTGAGCA AGTCCACTTC 2040 

l^^jS. SgTGAACAC TTTGGCCrcC CCGCTCCTCT CCAACCTGGC GACCCGRCTC 2X00 

T^ACGCA ACGGGGCXCC CGTCAATGCC TCGGOCTCCT GCCAOGTGCT ACCCACTGGG 2160 

^TcSgC AGCTGGTAGC CAGCTACroC CCAGAGGTGG TCGAGGACGG GGTGGCAGAC 2280 

icSrScAG TGTACCOJTC ATTAT^ 

GCTOrreGCA AGGCCAGCTG GGGTGCAGAC AGGTCCTACT GGAAOGftGTT CCTGGTCATG 2400 

TT^SgGC CGTGCIGCTC CCAG™ 24€0 

SSSI SSot^CT GAAGCAGGGG GAAtGT^ 2520 

^GTOGTGC T^CCCCTOA GACCCGCCCA CTCAACGGCC TAGGGCCCCC TAGCACXCCG 2580 

SSI?^ ^A^S GTCCCTGTCA GAO^GCC^ 2640 

GaScaGAGA aSgGCCACT CAGCATCCAA GACRGCTTCG TGG ftfiGTATC CCCAGTGTGC 2700 

^^^^^^ ^tSgOT tScTCGGAG ATCCGTGACT C 1 V113GTG TO AGAGCTGACT 2760 

TC^^ ?S^SctTC GTGGAACAOG ACCGT^ 2"° 
SSSJS StGCTGCTC TCCAGTCAAG TAGCGAAGCT CCTACCACCC AGACACCCAA 2940 

tg^ttot g^act^ cccrrrtrm aaaaaacaat tccaaatctg aaactagaat 3060 
gcatgcagca cacacggctg 3120 

^^roCTGG GGATCCATCC AAAGTOGTTG TCTGAGACRG ACrrOGAAAC OCraCCAAC 31.80 

S^^S^ aScScCT cagggac^ 

^S^SS TCGGACCCAA CTCCTGGWK TTT^ 3420 

^^SSg cSgAGCTCA GGAGft^ ^460 

S^iAGAG ACTGTCGCCr GCCTTCrrCC GT^ 3540 

^^SwTC CACCCrCGCr CCATCTTTGA ACTCAAACAC GAGGAACTAA CTGC^CCTG 3600 

G^CTCOC C»GTCCCCAG TTCACCCTCC ATCCCTCACC TTCCTCCACT CTAAGGGATA 3660 

C^SCAGG GGCOnxyU^T TTATG^^ 3720 
ATGCACTTTA TGTCATTTTT TAATAAAG1CC TGAAGAATTA CTGTTT 

Seq ID NO: 479 Protein sequence 
Protein Accession #: XP_044533.3 

] Y r r r r 
^i^^ 

SFKGKDPQRD CQNYIKILLP LSGSHLFTCG TAAPSPMCnf INMENPTLAR DE^NVLLED 
6KGRCPFDPN FKSTALWDG EI^VTGTVSSP QGNDPAISRS QSLRPTKTES SI^^D^AF 
VASAYIPSSL GSLQGDDDKI YFFFSETGQE FEFPENTIVS RIARICKGDE GGSRVLQQRW 
TSFLKAQIiLC SRPDDGFPFN VLQDVFTLSP SPQDWRDTLP YGVFTSQWHR GTTEGSAVCV 

^?Srvf sglykbvnre twwytvthp vptprpgaci 'htsarerkin sslqlpdr^ 

NFLKDHFU1D GQVRSRKLLL QPQARYQRVA VHRVPGI«HT YDVI^PLGTGD GRI^KAVSVG 
PRVHIIEELQ IPSSGQPVQN LLLDTHRGLL YAASHSGWQ VPMANCSLYR SC^CLL^ 
PYCAWSGSSC KKVSLYQPQL ATRPWIQDIE GASAKDLCSA SSWSPSFVP TGOTCTQVQ 
FOPNTVNTLA CPLLSNIATR LWLRNGAPVN ASASCHVLPT GDLLLVGTQQ LGEFQOJ^E 
BGTOQLVASY CPBWBDGVA DQTDEGGSVP VIISTSRVSA PAGGKASWGA DRSYWKEPLV 
MCTLPVIAVL LPVLFLLYRH RNSMKVFLKQ GECASVHPKT CPWLPPBTR PLNGLGPPST 

Sdhrgyqsl sdsppgsrvf tesekrplsi qdsfvevspv cprprvrlgs eirdsw 

Seq ID NO: 480 DNA sequence 
Nucleic Acid Accession #: NM_004217.1 
Coding sequence: 58..1092 

1 11 21 31 41 51 

ggccgggaga Lagcagtgc cttggacccc agctctcctc cccctttctc tctaaggatg 
gcccagaagg agaactccta cccctggccc tacggccgac agacggctcc atctggcctg 

AGCACCCTCC CCCAGCGAGT CCrcCGGAAA GAGCCTGTCA CCCCATCTGC ACTTGTCCTC 180 

^TGAGOGCT ccaatgtcca ccocacagct gcccctggcc agaaggtgat ggagaatagc 
actcggacac cogacatctt aaogcggcac ttcacaattg atgactttga gattgggcgt 
cctcxgggca aaggcaagtt tggaaacgtg tacttggctc gggagaagaa aagccatttc 

ATOGIGGCGC TCAAGGTCCT CTTCAAGTCC CAGATAGAGA AGGAGGGCGT GGAGCATCAG 
CTGCGCAGAG AGATCGAAAT CCAGGCCCAC ctgcaccatc ccaacatoct gcgtctctac 

AACTATTrrr ATGACCGGAG GAGGATCTAC TTGATTCTAG AGTATGCCOC CCX30GGGGAC 540 

A^CTGCAGAA GAGCTCCACA TTTGACGAGC AGOGAACAGC CACGATCATC 600 

GAGGAGTTGG CAGATGCTCT AATGTACTGC CATGGGAAGA AGGTGATTCA CAGAGACATA 660 

AAGOCAGAAA ATCTCCTCTT AGGGCTCAAG QGAGAGCTGA ACATTGCTGA CTTCGGCTCG 720 

TCTGTGCATG CGCCCTCCCT GAGGAGGAAG ACAATGTGTG GCACCCTGGA CTACCTGCCC 780 

CCAGAGATGA TTGAGGGGOG CATGCACAAT GAGAAGGTGG ATClGTGGTG CA™?*?^^ J*? 

CTTTCCTATG AGCTCCTGGT GGGGAACCCA CCCTTTGAGA GTGCATCACA CAACGAGACC 900 

TATOGCOGCA TOGTCAAGGT GGACCTAAAG TTCCCOGCTT CTGTGCCCAC GGGAGCCCAG 960 

GACCTCATCT CCAAACTGCT CAGGCATAAC CCCTCX>3AAC GGCTGCCCCT GGCCCAGGTC 1020 

TCAGCOCACC CTTGGGTCCG GGCCAACTCT CGGAGGGTGC TGCCTCCCTC TGCCCTTCAA 1080 

TCTGTCGCCT GATGGTCCCT GTCATTCACT CGGGTGCGTG TGTTTGTATG TCTGTGTATG 1140 

TATAGGGGAA AGAAGGGATC CCTAACTGTT COCTTATCTG TTTTCTACCT CCTCCTTTGT 1200 
TTAATAAAGG CTGAAGCTTT TTGT 

Seq ID NO: 481 Protein sequence 
Protein Accession #: NP_004208 

1 11 21 31 41 51 
| | | | | | 

KAQKQISYPW PYGRQTAPSG LSTLPQRVLR KEPVTPSALV LXSHSNVQPT AAP GQ1CVM 3I 60 
SSGTPDILTR HFTIDDPEIG RPLGKGKFGN VYIAREKKSH FZVALKVLFK SQX5KEGVEH 120 
QLRSBIEIQA HLKHPNIIAIi YNYFVIHaiRI VLILSYAPRG ELYKELQKSC TF DgQgT ATI 180 
KEELADAI>IY O^GKKVIHRO IKPEKXiLLGIj kgelkiadfg wsvhapsiar kthog tlp yl 240 
PPaUEGSMH NEKVDIjWCIG vlcyellvgn pppesashne tyrrivkvdl kppasvptca 300 
QOLISKLLRH NPSERXiFXiAQ vsaepwv8an ssrvxippsal qsva 



Seq ID NO: 482 DNA sequence 
Nucleic Acid Accession #: AK0S5fi63 
Coding sequence: 38..1423 



1 11 21 31 41 51 
| | | | | | 
AGAAOGGCTT CXXXK33GGAG CTGTGCAGCT CCTTATCATG GGGACA ATTC ATCTCTTTCG 60 
AAAACCACAA AGATCCTTTT TTGGCAAGTT GTTACGGGAA TTTAGACTT6 TAGCAGCTGA 120 
GGGAAGGTCC TGGAAGATAC TGCTCTTTGG TGTAATAAAC TTGATATGTA CTGGCTTCCT 180 
GCTTATGTGG TGCAGTTCTA CTAATAGTAT AGCTTTAACT GCCTATACTT ACCTGACCAT 240 
TTTTGATCTT TTTAGTTTAA TGACATGTTT AATAAGTTAC TGGGTAACAT TGAGGAAACC 300 
TAGCCCTGTC TATTCATTTG GGTTTGAAAG ATTAGAAGTC CTGGCTGTAT TTGCCTCCAC 360 
AGTCTTGGCA CAGTTGGGAG CTCTCTTTAT ATTAAAAGAA AGTGGAGAAC GCTTTTTGGA 420 
ACAGCCCGAG ATACACAOGG GAAGATTATT AGTTGGTACT TTTGTGGCTC TTTGTTTCAA 480 
CCTGTTCAOG ATGcrrrcTA TTCGGAATAA ACCTTTTGCT TATGTCTCAG AAGCTGCTAG 540 
TAOGAGCTGG CTTCAAGAGC ATGTTGCAGA TCTTAGTCGA AGCTTGTGTG GAATTATTCC 600 
CGGACTTAGC AGTATCTTCC TTCCCOGAAT GAATCCATTT GTTTTGATTG ATCTTGCTGG 660 
AGCATTTGCT CTTTCTATTA CATATATGCT CATTGAAATT AATAATTATT TTGCCGTAGA 720 
CACTGCCTCT GCTATAGCTA TTGCCTPGAT GACATTTGGC ACTATGTATC CCATGAGTGT 780 
GTACAC5TGGG AAAGTCTTAC TCCAGACAAC ACCACCCCAT GTTATTGGTC AGT TGGACAA 840 
ACTCATCAGA GAGGTATCTA CCTTAGATGG AGTTTTAGAA GTOOGAAATG AACATTTTTG 900 
GACCCTAGGT TTTGGCTCAT TGGCTGGATC AGTGCATGTA AGAATTOGAC GAGATGCCAA 960 
TGAACAAATG GTTCTTGCTC ATGTGACCAA CAGGCTGTAC ACTCTAGTGT CTACTCTAAC 1020 
TGTTCAAATT TTCAAGGATG ACrGGATTAG GCCTGCCTTA TTGTCTGGGC CTGTTGCAGC 1080 
CAATGTCCTA AACTTTTCAG ATCATCAOGT AATOCCAATG CCTCTTTTAA AGGGTACTGA 1140 
TGATTTGAAC CCAGTTACAT CAACTCCAGC TAAACCTAGT AGTCCACCTC CAGAATTTTC 1200 
ATTTAACACT CCTGGGAAAA ATGTGAACCC AGTTATTCTT CTAAACACAC AAACAAGGCC 1260 
TTATGGTTTT GGTCTCAATC ATGGACACAC ACCTTACAGC AGCATGCTTA ATCAAGGACT 1320 
TGGAGTTCCA GGAATTGGAG CAACTCAAGG ATTGAGGACT GGTTTTACAA ATATACCAAG 1380 
TAGATATGGA ACTAATAATA GAATTGGACA ACCAAGACCA TGATAGACTC TAACTTATTT 1440 
TTATAAGGAA TATTGACTCC TTGGCTTCCA ATTTATTTAG TAATCCAACT TTGCATTGAC 1500 
TGTTTAATCA TTTACTCTAA ATGTTAGATA ATAGTAGTCT TGTTCACATT TCATGAAACC 1560 
TATGAAACTA TATTTTTGTA AAATGTATTT GTGACAGTGA AATCCTCGTA AATGTTAAAG 1620 
GCTTTAAATA GGCTTCCTTT AGAAAATGTG TTTCTTTAAA TTTGGATTTT OGTATCTTTG 1680 
GTTTTCTAGT TGACTGCAGT GTGATGTGAC CTTACCTTTA TAAGAGCCAC TTGATGGAGT 1740 
AGATCTGTCA CATTACTAAG ATACGATATT TCTTTTTTTT TC0GAGACX3G AGTCTTGCTC 1800 
TGCCACTGTG CCCGGCCAAT ACATTATTAT TAACTTAAGG CTGTACnTA TTAAGGCTTC 1860 
CTTAGTTTTT GTTTTGTTTT GTTTTTTGAG ATGGAGTCTC ACTCTGTCGC CCAGGCTGGA 1920 
ATGCAGTGGC ATGATCTCAG CTCACTGCAA CCTCTGCCTC CTGAGTTCAA ATGATTCTCC 1980 
TGCCTCAGCC TCCCGAGTAG CTGGGATTAC AGGCACCTQC CACCAOGCCC AGCTAATTTT 2040 
TCTATTTTTA GTAAAGACGG GGGATTTCAC CATGTTGGCC AGGCTGGTCT TGAACTCCTG 2100 
ACCTCATGAT CCACCCACCT TAGCCTCCCA AAGTGCTGGG ATTAGGTGTG AGCCACGGCA 2160 
CCTGGCCGAT ATTTTCTTTA ATGAAATTTA TAAATATGCT TCTTGAATAA TACACATTTT 2220 
GGGAAAGGGA AAAATGTCT6 TTCAAAAAGT AAAGGTCTCT TTTATAGCTT TTCCAAACTT 2280 
AATTGCTAAA TTTTTCTTTG AGGTTCTCCT GAATTATGTC TTACAAACTA AAAGCAAAAA 2340 
TTTTTAGCAG AAATTTTGGA ATACATTCTA TCTAGCACAA TTTGAATTTT TAATTATCAA 2400 
GATTTTTGTT AAAGTTTCTC TCCTTTAAAA ATTTTAGTAC ATTTGTAAAT 

Seq ID NO: 483 Protein sequence 
Protein Accession #: BAB709aa.l 

1 11 21 31 41 51 
| | | | | | 
MGTIHIiFRKP QRSFFGKLLR EFRLVAADRR SWKILLFGVI NLICTGFLLM MCSSmSIAL 60 
TAYTYLTIFD LFSLMTCLIS YWVTLRKPSP VYSFGFERLE VLAVFASTVL AQLGALFILK 120 
ESAERFLEQP EIHTGRLLVG TPVALCFNLP TMLSIRKKPF AYVSEAASTS WLQEHVADLS 180 
RSLOGIIPGL SSIFkPRMNP FVLIDLAGAP ALCITYKLIB INNYFAVDTA SAIAIAIiMTP 240 
GTMYPMSVYS GKVLLOTTPP HVIGQLDKLI REVSTLDGVL EVRKEHFWTL GFGSLAGSVH 300 
VRIRSOANBQ MVLAHVTNRL YTLVSTLTVQ IFKDDHIRPA LLSGPVAANV LHPSDHKVIP 360 
MPIiLKGTDDL NPVTSTPAKP SSPPPEPSFN TPGKNVNPVI LtNTQTRPYG FGLNHOETpy 420 
SSMLNQGLGV PGIGATQGLR TGFTNIPSRY GTMNRIGQPR P 

Seq ID NO: 484 DNA sequence 

Nucleic Acid Accession #: FGENESH predicted 

Coding sequence: 1..900 

1 11 21 31 41 51 
| | | | | | 
ATGCCGCCGC GGGAGCTGAG CGAGGCCGAG CCGCCCCCGC TCCGGGCCCC GACCCCTCCC 60 
COOCGGCGGC GTAGCGCGCC CCCAGA6CTG GGCATCAAGT GCGTGCTGGT GGGCGAOGGC 120 
GCOGTGGGCA AGAGCAGCCT CATCGTCAGC TACACCTGCA ATGGGTACCC CGCGCGCTAC 180 
CGGCCCACTG CGCTGGACAC CTTCTCTGGT ACGTAGGTTC AATC6CCCGT GCG6CCGCGT 240 
GGCTGCGGCG GGGCTGTGCA CCGGGGAGCr GGGGCGGGGG TCTGGGGGG6 AGGGCGCAGA 300 
GGACCCCGGG GAGGAGACTG GAGCAGGGCC GGAGGTGGCG CTGGTGCGGC CCAGGAOGCT 360 
CTTCCTAACT CAGGCTCTCC CCGCCCCGCC CCTGCAGTGC AAGTCCTGGT GGATGGAGCT 420 
COGGTGCGCA TTGAGCTCTG GGACACAGCG GGACAGGAGG ATTTTGACCG ACTTCGTTCC 480 
CTTTGCTACC OGGATACOSA TGTCTTCCTG GCGTGCTTCA GOGTGGTGCA GCCCftGCTCC 540 
TTTCAAAACA TCACAGAGAA ATGGCTGCCC GAQATCOGCA CGCACAACCC CCAGGCGCCT 600 
GTGCTGCTGG TGGGCACCCA GGCCGACCTG AGGGAOGATG TCAACGTACT AATTCAGCTG 660 
GACCAGGGGG GCCGGGAGGG CC O XTGCCC CAACCCCAGG CTCACGGTCT GGCOGAGAAG 720 

ATCCGAGCCT GCTCCTACCT TC5AGTCCTCA GCCTTGACGC AGAACAACTT GAAQGAAGTA 780 

TTTGACTCGG CIATTCTCRG TGOa^TTGAG CACAAAGCOC aSCTGGAGAA GAAACTGAA T 840 
GCCAAAGGTG TCCGCACCCT CTCCOGCTGC OGCTGGAAGA AGTTCTTCTG CTTCGTTTGA 

Seq ID NO: 485 Protein sequence 
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

Lprelseae ppplraptpp prrrsappel GIKCVLVGDG AVGKSSLIVS YTCKGYPARY 60 

RPTALOTFSG TYVQSPVRPR GCGGAVHRGA GAGVSAQGRR GPRGGDWSRP RGGAGAAQDA 120 

LPNSGSPRPA PAVQVLVDGA PVRIELWDTA GQEDFDRLRS LCYPDTDVFL ACFSVVQPSS 180 

FQNITEKMLP EIRTHNPQAP VLLVGTQADL RDDVNfVLIQL DQGGREGPVP QPQAQGLAEK 240 
IRACCYLECS ALTQKNLKEV FDSAILSAIE HKARLEKKLN AKGVRTLSRC RWKKPFCFV 

Seq ID NO: 486 DNA sequence 

Nucleic Acid Accession #: XM_063832.2 

Coding sequence: 1..711 

1 11 21 31 41 51 

ATGCCGCCGC GGGAGCTGAG CGAGGCCGAG CCGCCCCCGC TCOTGGCCCC GACCCCTCCC 60 

CCGCGGOGGC GTAGCGCGCC CCCAGAGCTG GGCATCAACT GOGTGCTQGr GGGCGAOGGC 120 

GCCGTCGGCA AGAGCAGCCT CATCGTCAGC TACACCTGCA ATGGGTAOCC OGCGCGCTAC 180 
OGGCCCACTG CGCTGGACAC CTTCTCTGTG CAAGTCCTGG TGGATGGAGC TCCGGTGCGC 
ATTGAGCTCr GGGACACAGC GGGACAGGAG GATTTTGACC GACTTCGTTC CCTTTGCTAC 
CCGGATACCG ATGTCTTCCT GGCGTGCTTC AGCGTGGTGC AGCCCAGCTC CTTTCAAAAC 

ATCACAGAGA AATGGCTGCC CGAGATCCGC ACGCACAACC CCCAGGOSCC TGTGCTGCTG 420 

GTGGGCACXX: AGGCC6ACCT GAGGGACGAT 6TCAA0GTAC TAATTCAGCT GGACCW3GGG 480 

GGCCX3GGAGG GCCCCGTGCC CCAACCCCAG GCTCAGGGTC TGGCCGAGAA GATCCGAGCC 540 

TGCTGCTACC TTGAGTGCTC AGCCTTGACG CAGAAGAACT TGAAGGAAGT ATTTGACTCX5 600 

GCTATTCTCA GTGCCATTGA GCACAAAGCC CGGCTGGAGA AGAAACTGAA TGCCAAAGGT 660 
GIGCXSCAOCC TCTCCOGCTG COGCT6GAAG AAGTTCTTCT GCTTCGTTTG A 

Seq ID NO: 487 Protein sequence 
Protein Accession #: XP_063832.1 

1 11 21 31 41 51 

MPPRELSEAE PPPLRAPTPP PRRRSAPPEL GIKCVLVGDG AVGKSSLIVS YTCNGYPARY 60 

RPTALDTFSV QVLVOGAPVR lELMDTAGQE DFDRLRSLOf PDTDVPLACP SWQPSSFQN 120 

ITEKWLPEIR THNPQAPVLL VGTQADLRDD VNVLIQLDQG GREGPVPQPQ AQGLABKIRA 180 
CCYLECSALT QKNLKEVFDS AILSAIEHKA RLEaOOiNAKG VRTLSRCRWK KPFCPV 

Seq ID NO: 488 DNA sequence 

Nucleic Acid Accession #: NM_014398.1 

Coding sequence: 64..1314 

1 11 21 31 41 51 

GGCACCGATT CGGGGCCTGC CCGGACTTCG CCGCACGCTG CAGAACCTCG CCCAGCGCCC 60 

ACCATGCCCC GGCAGCTCAG CGCGGCGGCC GCGCTCTTCG OGTCCCTGGC CGTAATTTTG 120 

CACGATGGCA GTCAAATGAG AGCAAAAGCA TTTCCAGAAA CCAGAGATTA TTCTCAACCT 180 

ACTGCAGCAG CAACAGTACA GGACATAAAA AAACCTGTCC AGCAACCAGC TAAGCAftGCA 240 

CCTCACCAAA CTTTA6CAGC AAGATTCATG GATGGTCATA TCACCTTTCA AACRGCGGCC 300 

ACAGTAAAAA TTCCAACAAC TACCCCAGCA ACTACAAAAA ACACTGCAAC CACCAGCCCA 360 

ATTACCTACA CCCTX3GTCAC AACCCAGGCC ACACCCAACA ACTCACACAC AGCTCCTCCA 420 

GTTACTGAAG TTACAGTCGG CCCTAGCTTA GCCCCTTATT CACTGCCACC CACCATCACC 480 

CCACCAGCTC ATACAGCTGG AACCAGTTCA TCAACOGTCA GCCACACAAC TGGGAACACC 540 

ACTCAACCCA GTAACCAGAC CACCCTTCCA GCAACTTTAT CGATACCACT GCACAAAAGC 600 

ACAACOGGTC AGAAGCCTGA TCAACCCACC CATGCCCCAG GAACAACGGC AGCTGCCCAC 660 

AATACCACCC GCACAGCTGC ACCTGCCTCC ACGGTTCCTG GGCOCACCCT TGCACCTCAG 720 

CCATCGTCAG TCAAGACTCG AATTTATCAG GTTCTAAACG GAAG CftGAC T CTGTATAAAA 780 

GCAGAGATGG GGATACAGCT GATTGTTCAA GACAAGGAGT CGGTTTTTTC ACCTCGGA6A 840 

TACTTCAACA TCGACCCCAA CGCAACGCAA GCCTCTGGGA ACTGTGGCAC CCGAAAATCC 900 

AACCTTCTGT TGAATTTTCA GGGCGGATTT GTGAATCTCA CATTTACCAA GGATGAAGAA 960 

TCATATTATA TCAGTGAAGT GGGAGCCTAT TTGACCGTCT CAGATCCRGA GACAGTTTAC 1020 

CAAGGAATCA AACATGCGGT GGTGATGTTC CAGACAGCAG TCGGGCATTC CTTCAAGTGC 1080 

GTGAGTGAAC AGAGCCTCCA GTTGTCAGCC CACCTGCAGG TGAAAACAAC CGATGTCCAA 1140 

CTTCAAGCCT TTGATTTTGA AGATGACCAC TTTGGAAATG TGGATGAGTG CTCGTCTGAC 1200 

TACACAATTG TGCTTCCTGT GATTGGGGCC ATCGTGGTTG GTCTCTGCCT TATGGGTATG 1260 

GGTGTCTATA AAATCGGCCT AAGGTGTCAA TCATCTGGAT ACCAGAGAAT CTAATTGTTG 1320 

CCCGGGGGGA ATGAAAATAA TGGAATTTAG AGAACTCTTT CATCCCTTCC AGGATGGATG 1380 

TTGGGAAATT CCCTCAGAGT GTCGGTCCTT CAAACAATGT AAACCACCAT CTT CTATTCA 1440 

AATGAAGTGA GTCATGTGTG ATTTAAGTTC AGGCAGCACA TCAATTTCTA AATAwii ix T 1500 

GTTTATTTTA TGAAAGATAT AGTGAGCTGT TTATTTTCTA GTTTCCTTTA GAATATTTTA 1560 

GCCACTCAAA GTCAACATTT GAGATATGTT GAATTAACAT AATATATGTA AAGTAGAATA 1620 

AGCCTTCAAA TTATAAACCA AGGGTCAATT GTAACTAATA CTACTGTGTG TGCATTGAAG 1680 

ATTTTATTTT ACCCTTGATC TTAACAAAGC CTTTGCTTTG TTATCAAATC GACTTTCAGT 1740 

GCTTTTACTA TCTGTGTTTT ATGGTTTCAT GTAACATACA TATTOCTGGT GTAGCACTTA 1800 

ACTCCTTTTC CACTTTAAAT TTGTTrrCGT TTTTTGAGAC GGAGTTTCAC TCTTGTCACC 1860 

CAGGCTGGAG TACAGTGGCA CGATCTCGGC TTATGGCAAC CTCCGCCTCC CGGGTTCAAG 1920 

TGATTCTCCT GCTTCAGCTT CCCGAGTAGC TGGGATTACA GGCACACACT ACCACGCCTG 1980 

GCTAATTTTT GTATTTTTAT TATAGACGGG TTTCACCATG TTGGCCAGAC TGGTCTTGAA 2040 

CTCTTGACCT CA(3GTOATCC AOCCACCTCA GCCTCCCAAA GTGCTGGGAT TACAGGCATG 2100 

AGCCATTGCG CCCGGCCTTA AATGTTTTTT TTAATCATCA AAAAGAACAA CATATCTCAG 2160 
GTTGTCTAAG TGTTTTTATG TAAAACCAAC AAAAAGAACA AATCAGCTTA TArTTTTTAT 2220 

CTTGATGACT CCTGCTCCAG AATTGCTAGA CTAAGAATTA GGTGGCTACA GATGGTAGAA 2280 

CtAAACAAlA AGCAAGAGAC AATAATRATG GOOCTTAATT ATTAACAAAG TGCCAGAGTC 2340 

TAiGGCTAAGC ACTTTATCTA TATCTCATTT CATTCTCACA ACTTATAAGT GAAIXSAGTAA 2400 

ACTGAGACTT AAGGGAACTG AATCACTTAA ATGTCACCTG GCTAACTGAT GGCAGAiSCCA 2460 

GAGCTTGAAT TCATGTTGGT CTGACATCAA GGTCrTTCGT CTTCTCCCTA CaCCAAGTTA 2520 

CCTACAAGAA CAATGACACC ACACTCTGCC TGAAGGCTCA CACCTCATAC CAGCATACGC 2580 

TCACCTTACA GGQAAA1GGG TTTATCCflGG ATCATGACAC ATTAGGGTAC ATGAAAGGAG 2640 

AGCTTTGCAG ATAACAAAAT AGCCTATCCT TAATAAATCC TCCACTCTCT GGAACGAGAC 2700 

TGAGC3GGCTT TGTAAAACAT TAGTCAGTTG CTCATTTTTA TGGGATTGCT TAGCTGGGCT 2760 

GTAAAGATGA AGGCATCAAA TAAACTCAAA GTATTTTTAA ATTTTPTPGA TAATACAGAA 2820 

ACTTCGCTAA CCAACTGTTC TTTCTTGAGT GTATAGCCCX: ATCTTGTGGT AACTTGCTGC 2880 

TTCTGCACTT CATATCCATA TTrCCTATTG TTCACrTTAT TCTGTAGAGC AGCCTGCCAA 2940 

GAATTTTATT TCTGCTGTTT TTTTTGCTGC TAAAGAAAGG AACTAAGTCA GGATGTTAAC 3000 

AGAAAAGTCC ACATAACCCT AGAATTCTTA GTCAAGGAAT AATTCAAGTC AGCCTAGAGA 3060 

CCATGTTGAC TTTCCTCATG TCTTTCCTTA TGACTCAGTA AGTTGGCAAG GT CCTGA CTT 3120 
TAGTCTPAAT AAAACATTGA ATIGTAGTAA AGGTTTTTGC AATAAAAACT TACTTTGG 

Seq ID NO: 489 Protein sequence 
Protein Accession #: NP_055213.1 

1 11 21 31 41 51 

MPRQLSAAAA LFASIAVILH DGSQMRAKAP PBTRDYSQPT AAATVQDIKK PVQQPAKOAP 60 

HQTLAARFMD GHITPQTAAT VKIPTTTPAT TKNTATTSPI TYTLVTTQAT PNNSHTAPPV 120 

TEVTVGPSLA PYSLPPTITP PAHTAGTSSS TVSHTTGNTT QPSNQTTLPA TLSIALHKST 180 

TGQKPDQPTH APGTTAAAHN TTRTAAPAST VPGPTLAPQP SSVKTGIYQV LNGSRLCIKA 240 

- BMGIQLIVQD KESVFSPIWY FNIDPNATQA SGNCXTTiaSN lilJfPQGGF^ 300 

YYISEVGAYL TVSDPETVYQ GIKHAWMFQ TAVGHSPKCV SEQSLQLSAH LQVKTTDVQL 360 
QAPDFEDDHF GNVDECSSDY TIVLPVIGAI WGLCUIGMG VYKIRLRCQS SGYQRI 



Seq ID NO: 490 DNA sequence 

Nucleic Acid Accession #: NM_005409.3 

Coding sequence: 94..378 

1 11 21 31 41 51 

TTCCTTTCAT GTTCAGCATT TCTACTCCTT CCAAGAAGAG CAGCAAAGCT GAAGTAGCAG 60 

CAACAGCACC AGCAGCAACA GCAAAAAACA AACATGAGTG TGAAGGGCAT GGCTATAGCC 120 

TTGGCTGTGA TATTGTGTGC TACAGTTGTT CAAGGCTTCC CCATGTTCAA AAGAGGACGC 180 

TCTCTTTGCA TAGGCCCTGG GGTAAAAGCA GTGAAAGTGG CAGATATTGA GAAAGCCTCC 240 

ATAATGTACC CAAGTAACAA CTGTQACAAA ATAGAAGTGA TPATTACCCT GAAAGAAAAT 300 

AAAGGACAAC GATGCCTAAA TCCCAAATCG AAGCAAGCAA GGCTTATAAT CAAAAAAGTT 360 

GAAAGAAAGA ATTTTTAAAA ATATCAAAAC ATATGAAGTC CTGGAAAAGG GCATCTGAAA 420 

AACCTAGAAC AAGTTTAACT GTGACTACTG AAATGACAAG AATTCTACAG TAGGAAACTG 480 

AGACTTTTCT ATGGTTTTGT GACTTTCAAC TTTTGTACAG TTATGTGAAG GATGAAAGGT 540 

QGGTGAAAGG ACCAAAAACA GAAATACAGT CTTCCTGAAT GAATGACAAT CAGAATTCCA 600 

CTGCCCAAAG GAGTCCAGCA ATTAAATGGA TTTCTAGGAA AAGCTACCTT AAGAAAGGCT 660 

GGTTACCATC GGAGTTTACA AAGTGCTTTC AOGTTCTTAC TTGTTGTATT ATACATTCAT 720 

GCATTTCTAG GCTAGAGAAC CTTCTAGATT TGATGCTTAC AACTATTCTG TTGTGACTAT 780 

GAGAACATTT CTGTCTCTAG AAGTTATCTG TCTGTATTGA TCTTTATGCT ATATTACTAT 840 

CTGTGGTTAC AGTGGAGACA TTGACATTAT TACTGGAGTC AAGCCCTTAT AAGTCAAAAG 900 

CATCTATGTG TCGTAAAGCA TTCCTCAAAC ATTTTTTCAT GCAAATACAC ACTTCTTTCC 960 

CCAAATATCA TCTAGCACAT CAATATGTAG GGAAACATTC TTATGCATCA TTTGGTTTGT 1020 

TTTATAACCA ATTCATTAAA TGTAATTCAT AAAATGTACT ATGAAAAAAA TTATACGCTA 1080 

TGGGATACTG GCAACAGTGC ACATATTTCA TAACCAAATT AGCAGCACCXS GTCTTAATTT 1140 

GATGTTTTTC AACTTTTATT CATTGAGATG TTTTGAAGCA ATTAGGATAT GTGTGTTTAC 1200 

TGTACTTTTT GTTTTGATCC GTTTGTATAA ATGATAGCAA TATCTTGGAC ACATTTGAAA 1260 

TACAAAATGT TTTTGTCTAC CAAAGAAAAA TGTTGAAAAA TAAGCAAATG TATACCTAGC 1320 

AATCACTTTT ACTTTTTGTA ATTCTGTCTC TTAGAAAAAT ACATAATCTA ATCAATTTCT 1380 

TTGTTCATGC CTATATACTG TAAAATTTAG GTATACTCAA GACTAGTTTA AAGAATCAAA 1440 
GTCATTTTTT TCTCTAATAA ACTACCACAA CCTTTCTTTT TTAAAAAAAA AAA 



Seq ID NO: 491 Protein sequence 
Protein Accession #: NP_005400.1 



1 11 21 31 41 51 

MSVKGMAIAL AVILCATWQ GPPMPKRGRC LCIGPGVKAV ICVADIBKASI MYPSNNCDKI 60 

EVIITLKENK GQRCIjNPKSK QARIjIIKICVE SKNP 



Seq ID NO: 492 DNA sequence 

Nucleic Acid Accession #: NM_000577.1 

Coding sequence: 41..520 

1 11 21 31 41 51 

| | | | | | 

GGCACGAGGG gaagacctcc TGTCCTATCA GGCCCTCCCC ATGGCTTTAG AGACGATCTG 60 

CCGACCCTCT GGGAGAAAAT CCAGCAAGAT GCAAGCCTTC AGAATCTGGG ATGTTAACCA 120 

GAAGACCTTC TATCTGAGGA ACAACCAACT AGTTGCCGGA TACTTGCAAG GACCAAATGT 180 

CAATTTAGAA GAAAAGATAG ATOTGGTACC CATTGAGCCT CATGCTCTGT TCTTGGGAAT 240 

CCAT6GAGGG AAGATGTGCC TGTCCTGTGT CAAGTCTGGT GATGAGACCA GACTCCAGCT 300 

GGAGGCAGTT AACATCACTG ACCTGAGCGA GAACAGAAAG CAGGACAAGC GCTTCGCCTT 360 

CATCCGCTCA GACAGTGGCC CCACCACCAG TTTTGAGTCT GCCGCCTGCC CCGGTTGGTT 420 

CCTCTGCACA GCGATGGAAG CTGACGAGCC CGTCAGCCTC ACCAATATGC CTGACGAAGG 480 

OGTCATGGTC ACCAAATTCT ACTTCCAGGA GGAOSAGTAG TACTGCCCAG GOCTGCCTGT 540 

TCCCATTCTT GCATGGCAAG GACTGCAGGG ACTGCCAGTC CCCCTGCXXC AG(3GCTCC0G 600 
GCTATGGGGG CACTGACGAC CAGCCATTGA GGGGTGGACC CTCAGAAGGC GTCSVCAACAA 
CJCTGGTCACA GGACTCTGCC TCCTCTTCAA CTGACCAGCC TOCATGCTGC CTCCAGAATG 
GTCTTTCTAA TGTtTTGAATC AGAGCACRGC AGCOCCTGCA CAAAGOCCTT OCATGTOSCC 
TCTGCATTCA GGATCAAACC CCGACCACCT GCCCAACCTG CTCTCCTCTT GCOICTGCC? 
CTTCCTCCCT CATTCCACCT TCCCATGOCC TGGATCCATC ACGCCACTTG ATGACCCCCA 
MXMXTtQGC TCCCACACCC TGTTTTACAA AAAAGAAAAG ACCAG TCCAT GAGGGAC3GTT 
TTTAACGGTT TGTGGAAAAT GAAAATTAGG ATTTCATGAT TTTTTmTT CAGTCCCCGT 
GAAGGAGAGC CCrTCATTTG GAGATTATGT IXri-WaS G G G AGAGGCT GAG GACTTA AAAT 
ATTCCrcCAT TTGTGAAATG ATGGTGAWW; TAAQfPGGTAG CTTTTCCCTT CTTTTTCTTC 
TTrnTTOTG ATOTCCCAAC TTGTAAAAAT TAAAACTTAT GGTACTATGT TAGCCCCATA 
ArmTTTTT TCCTTTTAAA ACACTTCCAT AATCTGGACT CCTCTGTCCA GGCACTGCTG 
CCCAGCCTCC AAGCTCCATC TCCACTCCAG ATTTTrTACA GCTGCCTGCA GTACTTTACC 
TCCTATCAGA AGTTrCTCaG CTCCCAAGGC TCTGAGCAAA TGTGGCTCCT GGGGGTTCTT 

TCrrCCTCTC ctgaaggaat aaattgctcc ttgacattgt agagcttctg gcacttggag 

ACTTGTATGA AAGATGGCTG TGCCTCTGCC TGTCTCCOCC ACCAGGCTGG GAGCTCTGCA 
GAGCAGGAAA CATGACTCGT ATATGTCTCA GGTCCCTGCA GGGCCAAGCA OCTAGCCTCG 
CTCTTGGCAG GTACTCAGOG AATGAATGCT GTATATGTTG GGTCCAAAGT TCCCTACTTC 
CTGTGACTTC AGCTCTGTTT TACAATAAAA TCTTGAAAAT GCCTAAAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 
Seq ID NO: 493 Protein sequence 
Protein Accession #: NP_000568.1 

1 11 21 31 41 51 

KALETICRPS GRKSSKMQAP RIWDVNQKTF YLRNNQLVAG YLQGPNVNLE EKIDWPIEP 
HALFLGIHGG KMOiSCVKSG DBTRLQLEAV NITDLSENRK QDKRFAPIRS DSGPTTSFES 
AACPGWFLCT AMEADQPVSL TKMPDEGVMV TKFYFQSDE 

Seq ID NO: 494 DNA sequence 

Nucleic Acid Accession #: NM_002081.1 

Coding sequence: 222..1898 



GGCTGCCXX3A 
GGCTTTTGTT 
CGGGACCTTG 
AGAGGCGCGG 
GCTGGTGGCT 
GCAAGAGCCG 
GCGACGTGCC 
CCTGCTGCAC 
CCGCGCTCCG 
TCGATGACCA 
CCGGC6CCTT 
AGCTGCGCCT 
GGGCCCGCCT 
ACTACCTGGA 
GAGAGCTGCG 
TGGGCX3TGGC 
CXSAGAGCTGT 
CCTGCCCTGA 
ACGCCGAGTG 
CATCX3GGTGT 
CCCTCCAGGA 
AGGTCAACCC 
GGGAGAGGCC 
GCGACGTCCA 
TGAGCACTGC 
AGG7CATGGG 
CCAAGCOGGA 
TGCGCAGCGC 
GCTCGGGCAG 
GCTCCAGCTC 
AGAAGACCTC 
TCCTGGCCCT 
GAGGCCAAGG 
TGGAGAGGCX: 
GTCCCAGCCC 
CAGGTCAGCT 
TCCGGCTGCC 
C7ACAGAGGA 
CGCCTCCTCC 
CCTCCAGAGA 
TCTQAGATGA 
CTGCX3CCCTT 
GGAGTCTGAG 
GGGGCCACTG 
G6AGGCAGCG 
GCCTTGCTGG 
CCTGCTCCCA 
CAGGGCTCAG 
CCCCGCACTG 
GCAC6GGGAC 
TGTCCTTGTT 
CAOCTTGGAC 



11 
I 

GCGAGCGTTC 
GTCTCCGCCT 

gctctgccct 
gcgggtggcc 
gctatgtgcx; 
gagctgcx»3c 
ccaggcggag 
C3«x:gagatg 
ggacagcagc 

CTTCCAGCAC 
GGGA6A6CTG 
GTACTACCGC 
GCTCGAGCGC 
CTGCCTGGGC 
CCTGCGGCCC 
CAGCGACGTG 
CATGAA6CTG 
CTATTGCCGA 
GAGGAACCrC 
GGAGAGTGTC 
CAACAGGGAC 
CCAGGGCCCT 
ACCTTCAGGC 
GGACrrCTGG 
CAGTGATGAC 
TGAOGGCCTG 
CATGACCATC 
CTACAACGGC 
CGGT6ATG6C 
CGGGAGGGCC 
GGCTGGCAGC 
TACAG TAGCC 
ACTGACTTTG 
TGGGGTGGGA 
CAGGCCTGGC 
GGGAGCCAGT 
TAGCCCTCCC 
GGCCTCAAAG 
CACTGG6ACT 
AGCCCCGCAC 
TGCATGATGC 
GAG6GGCCCC 
CACTGTCCTC 
ACCCACCTGC 
TGGGCTCTGC 
GGTCCAGGGC 
TCCTCACCCA 
AGTGACCCTC 
CACAOSGGAA 
CIGGATAGTT 
CATGGAGAGC 
OCTGGTGACC 



21 
I 

GGACCTCGCA 
CCTCGGCOGC 
TCGCGGGCGG 
GGGGGGGCCX3 
GCCGCAGC3GC 
GAGGTCCGCC 
ATCTCX5GGTG 
GAGGAGAAOC 
CGCGTCCTGC 
CTGCTGAACG 
TACAC6CA6A 
GGTGCCAACC 
CTCTTCAAGC 
AAGCAGGCOG 
ACCCGTGCCT 
GTCCGGAAAG 
GTCTACTGTG 
AATGTGCTCA 
CTGGACTCXa 
ATC6GCAG0G 
ACGCTCAOGG 
GGGCCTGAGG 
ACGCTGGAGA 
ATCAGCCTCC 
CGCTGCTGGA 
GCCAACCAGA 
CGGCAGCAGA 
AACGACXrrGG 
TGTCTGGATG 
TTGACCCATG 
TGCCCCCAGC 
AGGCCCOGGT 
CCAAAAATAC 
CAGGGAGGGC 
CTC3GCCTGCC 
GTGCCCAAAA 
CCCAGCTCCC 
CAACCOGCTG 
CCCAGCAGAG 
GGGCTGTCTG 
CCTCCCCTCA 
AGOGTCTGCA 
CCACAGACCC 
GCTTCTGCTG 
CAATGTGGGC 
TGTTGGAGGA 
GATCAGGAAC 
GGCTGTCAOC 
TGCCTAGGTC 
AAGGGCTTTT 
TGTTGGCTCC 
TCCTGTCACT 



31 

! 

CCCCGCGCGC 
CGCCGCCTCT 
GAACTGCGCA 
CCGGCCCOGC 
TGGTCGCCTG 
AGATCTAOGG 
AGCACCXGOG 
TGGCCAACCG 
AGGCCATGCT 
ACTCGGAGOG 
ACGCGAGGGC 
TGCACCTGGA 
AGCTGCACCC 
AGGOGCIGOG 
TCGTGGCTGC 
TGGCrCAGGT 
CTCACTGCCT 
AGGGCTGCCT 
TGGTGCTCAT 
TGCACA{3GTG 
CCAAGGTCAT 
AGAAGCGGCG 
AGCTGGTCTC 
CAGGGACACT 
AOGGGATGGC 
TCAACAACCC 
TCATGCAGCT 
ACTTCCAGGA 
ACCTCTGCGG 
CCCTCCCAGG 
CCCCGACCTT 
GGOGGTAACT 
AACACAGA06 
OGGCGGCTCT 
TTTCTGCCTT 
GCCATGTATT 
TGCACCX3CCG 
GAGCCCACAG 
OCCACCAGCC 
GGTGTCOGCC 
GCGCAGGCTG 
GGGTGACGCC 
TGCAGTGAGG 
GAGGAGGGGA 
TGCCCCTOGC 
COCCX»GGGC 
CAGGGCCTCC 
TGCTCACAGG 
CCTTCCCGAC 
CCAAACATGC 
TCCCAGATGG 
CACTGAGGCC 



41 
I 

CCCGOGCCGC 
GGACCGCGAG 
GGACCCGGCC 
CATGGAGCTC 
CGCCCGCGGG 
AGCCAAGGGC 
GATCTGTCCC 
CAGCCATGCC 
TGCCACCCAG 
GACGCTGCAC 
CTTCCGGGAC 
GGAGAC6CTG 
CCAGCTGCTG 
GCCCTTCGGG 
TCGCTCCTTT 
CCCCCTGGGC 
GGGAGTCCCC 
TGCCAACCAG 
CACCGACAAG 
GCTGGCGGAG 
CCAGGGCPGC 
CCGGGGCAAG 
TGAAGCCAAG 
GTGCAGTGAG 
CAGAGGCCGG 
CGAGGTGGAG 
GAAGATCATG 
CGCCAGTGAC 
CCXX»AGGTC 
CCTGTCAGAG 
CCTCCTGCCC 
GCCCCAAGGC 
ATATTTAATT 
GA6CAGGGGC 
TTAATTTTGT 
TCAGGGACCT 
CAGAAGCAGC 
OGAGCCTQTG 
AGCCCTGGCC 
ATCCAGGGTC 
CAGAGGCCGG 
TGAGACAGCA 
GGCCCTCCAT 
AGCTGGGCCC 
ACACAGGGCT 
TGAGGAGCAG 
CTGTTCACGG 
GATGCTGGTG 
CCACCCAGCT 
ATCCATTTAC 
CTTGQ6AGGC 
ATCAGGGOCC 



51 
! 

CGCCGCCGCC 
CCGCGCGCGC 
AGGATCCGAG 
CGGGCCCGAG 
GACCCGGCCA 
TTCAGCCTGA 
CAGGGCTACA 
GAGCTGGAGA 
CTGCGCAGCT 
GCCACCTTCC 
CTGTACTCAG 
GCCGAGTTCT 
CTGCCTGATG 
GAGGCCCCGA 
GTGCAGGGCC 
CCGGAGTGCT 
GGCGCCAGGC 
GCCGACCTGG 
TTCTGGGGTA 
GCCATCAACG 
GGGAACCOCA 
CTGGCCCCGC 
GCCCAGCrCC 
AAGATGGCCC 
TACCTCCCCG 
GTGGACATCA 
ACCAACCGGC 
GACGGCAGG6 
AGCAGGAAGA 
CAGGAAGGAC 
CTCCTCCTCT 
CCCAGGGACA 
CACCTCAGCC 
AGGOGCAGAG 
ATGAGGTCCT 
CAGGGGCACC 
CCCTCGAGGC 
CCTTCCTCCC 
CACCCCCCAG 
TGGCAGAGCC 
CCCCACCTCC 
CCACTGCTGA 
GCGCAGATGA 
AAAGGCCCAG 
CACAGGGCAG 
CCAG6ACO0G 
TGACACAGGT 
GCTGGTGAGA 
GCACTGCAGG 
TGACACTTCC 
CGGGAGGGCC 
TGCCCCAGGC 
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120 
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360 
420 
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540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
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1560 
c^2^3^^^m(X TcciOTG^ nil

TGTCGTGTTG GGAAGGGGTC CTGCMXXSCC AGGAGGACTT GGAGGGTCTS GGGGCTGCTG 3240 

TCCTOAAOCG ACTOACCCTG AGGAGGCOCC TT^^ ^300 

OGCACAGTGG ACCeAGGTCC COSGTT G CTG GTCAGGTCCC GATGGCrTGT TCrCTGGAAC 3360 

CTCACTTTAG ATCTTTTCtK ATCAGGAGCC 3420 

ScrSLr (XXAGGGTCG GCIGGGGACT ^480 

CAGCACTCCC GCTGCACACA GACGGCCTAG GGGTGGGGCT CaCACOCCAC CCTAOGCTCA 3540 

TCTCTGGAAG GGGCMSCCCT GAGTCGTCAC TGGTCftGGGC AGTGGCCAAG OTPGCTGTG? 3600 

CCTTCCTCCA CAAGGTCCCC CCACCGCTCA GTGTCAGOGG GTGAOGTGTG TTCTTTTGAG 3660 
TCCTTGTATG AATAAAAGGC TGGAAACCTA AA 

Seq ID NO: 495 Protein sequence
Protein Accession #: NP_002072.1

1 11 21 31 41 51

MBLRAHGMWL LCARAALVAC ARCSDPASKSR SOGEVRQIYG AKGPSbSDVP QAEISGEHLR 
ICPQGYTCCT SEMEENLANR SHAELBTALR DSSRVLQAML ATQLRSFDDH EQHLLNOSra 

TLQATFPGAF GELYTQNARA PRDLYSELRI. YYRGANLHLE ETLASPWMU, ^^^F^OT 180 

QLLLPDDYLD CLGKQAEALR PFGEAPRELR LRATRAFVAA RSPVQGLGVA SDVVRTOaOy 240 

PU5PECSRAV MKLVYCAHCL GVPGARPCPD YCRNVLKGCL ANQADLDAEW RNLLDSMVLI 300 

TDKFWGTSGV ESVIGSVHTW LAEAIKALQD NRDTLTWCVI QGOaiPKVNP QGPGPEEKRR 360 

RGKLAPRERP PSGTLBKLVS EAKAQUIDVQ DPWISLPGTL CSEKMALSTA SDDRCWNGMA 420 

RGRYLPEVKG OGLANQINOT BVEVDITKPD MTIRQQIKQL KIMINRLHSA YNGNDWDFQD 480 
ASDOGSGSGS 

Seq ID NO: 496 DNA sequence

Nucleic Acid Accession #: NM_001650.2
Coding sequence: 40..1011

1 11 21 31 41 51 

GGGGCAGGCA ATGAGAGCTG CACTCTGGCT GGGGAAGGCA TGAGTGACAG ACCCACAGCA 60 

AGGCXK5TGGG GTAAGTGTGG ACCTTTGTGT ACCAGAGAGA ACATCATGGT GGCTTTCAAA 120

GGGGTCTGGA CTCAAGCTTT CTGGAAAGCA GTCACAGCGG AATTTCTGGC CATGCTTATT 180 

TTTGTTCTCC TCAGCCTGGG ATCCACCATC AACTGGGGTG GAACAGAAAA GCCTTTACCG 240 

GTCGACATGG TTCTCATCTC CCrPTGCTTT GGACTCftGCA TTCCAACCAT GGTGCAGTCC 300 

TTTGGCCATA TCAGCGGTGG CCACATCAAC CCTGCAGTGA CTGTOSCCAT GGTGTGCACX: 360 

AGGAAGATCA GCATCGCCAA GTCTGTCTTC TACATCX3CAG CCCAGTIGOCT GGGGGCCATC 420 

ATTGGAGCAG GAATCCTCTA TCTGGTCACA CCTCCCAGTG TGGTGGGAGG CCTCGGAGTC 480 

ACCATGGTTC ATGGAAATCT TACOGCTGGT CATGGTCTCC TGGTTGAGTT GATAATCACA 540 

TTTCAATTGG TGTTTACTAT CTTTCCCAGC TGTGATTCX:A AACGGACTGA TGTCACTGGC 600 

TCAATAGCTT TAGCAATTGG ATTTTCTGTT GCAATTGGAC ATTTATTTGC AATCAATTAT 660 

ACTGGTGCCA GCATCAATCC CGCCCGATCC TTTGGACCTG CAGTTATCAT GGGAAATTGG 720 

GAAAACCATT GGATATATTO GGTTGGGCCC ATCRTAGGRG CTGTOCTOGC TGCSTGGCCTT 780 

TATGAGTATG TCTTCTGTCC AGATGTTGAA TTCAAAOGTC GTTTTAAAGA ACCCTTCAGC 840 

AAAGCTGCCC AGCAAACAAA AGGAAGCTAC ATGGAGCTGG AGGACAACAG GAGTCAGGTA 900 

GAGACGGATG ACCTGATTCT AAAACCTGGA GTGGTGCATG TGATTGACXST TGACCGGGGA 960 

GAGGAGAAGA AQGGGAAAGA CCAATCTGGA GAGGTATTGT CTTCAGTATG ACTAGAAGAT 1020 

CGCACTGAAA GCAGACAAGA CTCCTTAGAA CTGTOCTCAG ATTTOCTTCC ACCCATTAAG 1080 

6AAACAGATT TGTTATAAAT TAGAAATGTG CAGGTTTGTT GTTTCATCTC ATATTACTCA 1140 

GTCTAAACAA TAAATATTTC ATAATTTACA AAGGAGGAAC GGAAGAAACC TATT GTGAAT 1200 

TCCAAATCTA AAAAAAGAAA TATTTTTAAG ATGTTCTTAA GCAA ATATA T ACCTATTTTA 1260 

TCIAGTTACC TTTCATTAAC AACCAATTTT AACCGTGTGT CARGATTTGG TTAAGTCTTG 1320 

OCTGACAGAA CTCftAAGACA CX3TCTATCAG CTTATTCCTT CTCTACTGGA ATATTGGTAT 1380 
AGTCAATTCT TATTTGAATA TTTATTCTAT TAAACTGAGT TTAACAATGG C 

Seq ID NO: 497 Protein sequence 
Protein Accession #: NP_001641.1

1 11 21 31 41 51 

MSDRPTARRW GKCGPLCTRB HIHVAFKGVM TQAFWKAVTA EFLAMLIFVL LSIX3STINWG 60 

GTEKPLPVDM VLISLCFOLS XATMVQCPGH ISGGHINPAV TVAMVCTRKI SIAKSVFYIA 120 

AQCLGAIIGA GILYLVTPPS WGGDGVTMV HGNLTACHGL liVELIITFQL VFTIPASCDS 180 

KRTDVTGSIA LAIGFSVAIG HLFAINYTGA SMNPARSPGP AVIMGNWEHH HIYWVGPIIG 240 

AVLAGGLYBY VPCPDVBFKR RFKEAPSKAA QQTKGSYMSV EDNRSQVBTD DIiILKPGWH 300 
VZDVDRGEEK KGKDQSGEVL SSV 

Seq ID NO: 498 DNA sequence 

Nucleic Acid Accession #: AB020684.1

Coding sequence: 1..1744

1 11 21 31 41 51

CUCCCnGTC ATTAATACAT TAAAAAGATT CAATCTTTAC CCIGAGGTAA TTTrGGCCAG 60

TTGGTACGQG ATTTATAOCA AAATAATGGA CTTGATTGGT ATTCAAACCA AGATATGTTG 120

GACGGTTACC AGAGGAGAAG GACTCAGTCC TATTGAAAGC TGTGAAGGAT TGGGAGATCC 180

TGCTTGCTTT TATCTTGCTG TAATTTTTAT TTTAAATGGA CTAATGATGG CATTATTCTT 240

CATATATGGC ACATATTTAA GTGGCAGCCG ATTAGGAGGC CTXSGTTACAG TGTTGTGCTT 300

CTTTTTCAAT CATGGAGAGT GTACCCGTGT AATGTGGACA CCACCTCTCC GTGAAAGCTT 360

CTCATATCCA TTTCTTGTTC TTCAGATGTT GCTAGTGACT CATATTCTCA GGGCTACAAA 420

ACTTTATAGA GGAAGCTTGA TTGCACTCTG CATTTCCRAT C?rATTTTTCA TCCTTCCTTG 480

GCAGTTTGCT CAGTTTGTAC TTCTTACTCA GATTGCATCA TEATTTGCAG TATATGTTGT 540

CGGGTACATT GATATATGTA AATTACGGAA GATCATTTAT ATACACATGA TTTCTCTTGC 600

ACrTTGTTTT GTTTTGATGT TTOGGRACTC AATCTTATTA ACTTCTTATT ATGCTTCTTC 660

TnajTAATT ATTTGGGGTA TTCTGGCAAT GAAACX»CAT TTCCTGAAAA TAAATGTATC 720

TCAACTTAGT TTATGQGrCA TTCAAGGATG TTTTTGGTTA TTTGGAACTG TCATACTTAA 780

ATACTTGACA TCTAAAATTT TTGGTATTGC AGATGACCCT CATAT TGGCA ACXTACTAAC 840 

ATCAAAATTX: TTTACTTATA AGGATTTTGA TACTTTATTG TATACCTGTG OGOOGACTT 900 

TGACTTTATC GAAAAAGAGA CTCCACTGAG ATACACAAAG ACATTATTGC TTCCAGTTGT 960 

TCTTGTAGTG TTTGTTCCTA TTGTTAGAAA GATtATTACT GATATGTGGG GTGTCTTAGC 1020 

TAAAC3UVCAG ACACATGTAA GAAAACACCA GTTTOATCAT GGAGAGCTGG TTTACCATGC 1080 

ATTGCAATTG TTAGCATATA CAGGCCTTOG TATTTTAATT ATGAGACTAA tACt i.-rim 1140 

GACACCACAC ATGTOTGTTA TGGCATCACT GATCTGCTCA AGACAGCTAT TTGGATGGCT 1200 

CTTTTGCAAA GTACATCCT6 GTGCrATTCT GTTTGCTATA TTACCAGCAA TCTCA ATACA 1260 

AGGTTCAGCA AATCTGCAAA CCCAGTGGAA TATTGTAGGG GAGTTCAGCA ATTT6CCCCA 1320 

AGAAGAACTT ATAGAATGGA TCAAATATAG TACTAAACCA GATGCAGTGT TTGCGGGTGC 1380 

CATCCCCACG ATCGCAAGTC TTAAGCTCTC TGCACTTCGG CCCATTGTGA ATCATCCACA 1440 

TTATGAAGAC GCAGGCTTCA GAGCCAGAAC AAAAATAGTA TACTCAATGT ATAGTCGGAA 1500

AGCAGCCCAA GAAGTCAAGC GAGAACTGAT AAACTTAAAA GTGAACTATT ACATTCTAGA 1560

AGAGTCATGG TCtCTAAGAA GATCCAACCC TGGTTGCW3T ATGCCTGAAA TTTGGGATGT 1620 

AGAAGATCCT GCCAATGCTG GGAAAACTCC CTTATGTAAC CrCTTGGTGA AGGATTCCAA 1680 

ACCTCACTTC ACCACTGIAT TCCAGAACAG TGTTTACAAA GTCCTAGAAG TTGTAA AAGA 1740 

ATGACroCTA CATGACCTGC TGCCTACGGA GAACTACATC TGTAATGGTT TTAATGTTTT 1800 

GCTAAGTCAT GTGTTGTTCA TATCCCAAAA ACTTTTATAC GTAACTGTTT TCA AATAGA A 1860 

AACGTTTrAT TTGGTCAATT TGAATGTCAT TCTAATTATA AAAATGACTT ACACCTTTAT 1920 

CAATTGGTTA CTATTTCAAT GCRCCCTTTA AAATTTGCTA TGCAAATGAG TATATGCTTG 1980 

TACTTGACTT TAATATTTGT 6CTAAAGTGA GCAAAGCTAC CTGTATAAAG AAAACACAGT 2040 

GGGTTGTGAC AAGGATGACA TGAAAATACA GGACAATTCT GACAATGTAG GGGCTGATTT 2100 

TATAGTGTAA GAACTATTAA TGCCCCTTGC TTCTTTTTTC TGCCTCTTGC TCTTGrcrrP 2160 

TGGACATTTC ACTGATTGTA AGTTCTTCGG TCATGTCAGC CCCTGTCATC AACTTGAGTT 2220 

ACAGTAGATG GGGCftGACAT GGAGTGrrTG CTATATAAAA CTATCTGTTT GTTTTACTTC 2280 

CntSTGCGCT TTTTGTTCTC TGTTCTCTTG TTAATGAAGC .TrTTCCTGCC. CATTATTAAT 2340 

CCAAACTCTT GGACCTTGTC GTTAGGAAAT TCCCTTAACT TCX^GCCATA TGGCATTATC 2400 

GTGTCTCrrr CTCrcrCTCr CTTCCTCTCT CTCTTCTCCT CTTCCCCATA TTTTCTGTCA 2460 

AATAAGTACT GTTTACTCAT TTAGTTGCTT ATCAAGTACT TATTCTTGGT TTTAAAAAAA 2520 

ATTAATGGTA ACTCTATTTT TCTCATTTTT AGCATTATTC AAATGTTTAT ATTTTAATAC 2580 

CTTTAAACCA CrTTAAAGTT TTTTCATGTT TAATTATAGT TTTAAGAAAA ACTATTTTGA 2640 

ACAACCCCAA ATATAGTGCA TCTAGAAACT AATGTATATT TGATTAGACA TCATTTATAG 2700 

TGGAACAGTA GACTGTAGTA CATGGTAATT TTTCTTTTAC TATTA AGATA CAATAAAACA 2760 

TGACTAATTT TGCTGTCAAA AATGTAAAGA ATAATGATAA ATGGAGTTTT TTATATTTTA 2820 

CTTTTAAGAT TGCCTGTCTT TAATAAGACA AAGCCTTAAG CCTTATGTTA TAATTTTGGT 2880 

TCTAAAAACC ATCATTTCAG TATAAGGAAT AAGTATATTT CGTCCTCCTC TTTAGTTTTT 2940 

TTCTTCCTAT TTATTTTTAT TTTGAAAAAT TTCTACACXTT TCTTTGAATT CCTTGTATGA 3000 

ATTTTTGTTT CTTAGAAGTT AATTTGTGTG AAATGAGATT CTTCAAAACG ATGAAACCTC 3060 

ATAGCTCTGA GAAAAGGTTT TAGGGmTA AATTCTAAGC AAAGCGTGAC TATGGCTGAC 3120 

AGACTACACA TTTAATTATA CAGCTTCTCT TTCTTAACCA CAGGCAGATT AACCTCATT6 3180 

TGGATTGTCC TTCAGACCTT AGTCCTCAGG CATGGTTTCT GGTGOOCACT CCTGGAAGCC 3240 

GCTGTTCCCT TTCTACCTTC TTACCAGAGC CCAAGGGCAG GCCTGGTCCC GG6GAAGCAG 3300 

CAGCTTGCTG ACATAAGTCA GCTGCAAAGG CTGAGGAGTG TGCCCTCAGA GAAGCACCGC 3360 

CCCCJCAGTCT TGTGCCAGCG CCTAGAGCCG CAGCTCCCAG GGATGCTCCT TCCCTGGAGG 3420 

CAGCCCAGGA GAGGGACTCT G6CAGCGTTC TTCAGATTTG TGGCCACTGT TTCTCATTTG 3480

CTGGTTGACT GTTTTTATTT CTTAGGCTTT TGCTAGTTTT AGAAAATAGG GAAGCAGCCC 3540 

TTGATTTGTG GATTAAAAGC AACATTTGAG OGATOATGCA CAACaGTCCA GGAAAATGGG 3600 

CGGTGGACAC TTGAGGCTGA GGATGGGAGT TGACATGAGC AGGGAGAGGG ACGTGCGCGC 3660 

TGCTTATCTG TGATTGTTGC TCACCTGAGT GTGGCTGATT GTGTACATCC AGCAGTTACA 3720 

ATTTTTAAAA ATTATACTTT TACATTTATT TTATATTTTT CTCACCCCCA GTAATTTCCT 3780 

TCCAAAGAAG TTCACATGTA ATAAGTAGAA ATTCTGTATA GGAAAAAAGC ATTAAAAATA 3840 

CTATTATAAC TGCTTCATTT GCTGGGAAOC ATTAAAAGTA ATATAAATTA GCTTTTTCCA 3900 

GAAGGATCCr TTTGTAGCAG TGTTTAT6AA TGTAACCCCC AGCAAAATAT GGCTATATAT 3960 

TAGGGGAGCC AGTTTGGAGC AGAGGCCTGA AGGTCCCTGC TATGCAGCCG TGGCX»CAGC 4020 

TCGCAGCCCA AGCACTGTGG AGCATCCACA CCTTTGATGG CAATGCAGAT TGGTAGCAGG 4080 

TTCCATAGGC GTACAAAACA GTATTAAAGC TCAGTGTTTT GCATATTGTT AGCATTTACA 4140 

AATATTTTTG CTTTAGTATG AGGAAAGTAA GGATGGGCAA AGAAGCGATC AAAATAGCTA 4200 

TTGCTACAAC ATTTTCGAAA ACAAAGTTGG GGCTGTATTT CTTTAAAAAG ATAAGCCTCT 4260 

AAAAATGCTT GGCAAAAAAA ATATAGTGTT AAAATAGGCX: AgTGA TATTA ATGAGAAAAT 4320 

GAAAGTATGT ATCAGGAATA AAGTGATATT GCATAGGAGT ATTGTATTTT TATGAATTTT 4380 
ATGCCAGTTG TTTACATGTA CTATATATGT TAAATTAAAA AAAATCATGA GAAATG 

Seq ID NO: 499 Protein sequence
Protein Accession #: BAA74900.1

1 11 21 31 41 51 

PLVINTLKRF NLYPEVILAS WYRIYTKIMD LIGIQTKICW TVTRGEGLSP lESCEGLGDP 60 

ACPYVAVIFI LNGLMMALFF lYGTYltSGSR LGGLVTVLCP FPNHGECTRV MMTPPLRESP 120 

SYPFLVLQML LVTHILRATK LYRGSLIALC ISNVFFMLPW QFAQFVLLTQ lASLPAVYW 180 

GYIDICKLRK IIYIHMISLA LCFVLMFGNS MLLTSYYASS LVIIWGILAM KPHFLKINVS 240 

ELSLWVIQGC PWLFGTVILK YLTSKIFGIA DDAHIQILLT SKFPSYKDFD TLLYTCAAEF 300 

DFMEKETPLR YTKTbLLPW LWPVAIVRK IISDMWGVLA KQQTHVRKHQ FDHGELVYHA 360

LQLIiAYTALG ILIMRUCLFL TPHMCVMASL ICSRQLPGWL PCKVHPGAIV FAILAAMSIQ 420 

GSANLQTQMN IVGEPSNLPQ EELIEWIKYS TKPDAVPAGA MPTMASVKLS ALRPIVNHPH 480 

YEDAGLRART KIVYSMYSRK AAEEVKRELI KIiKVNYYII*B ESWCVRRSKP GCSMPEIWDV 540 
BDPANAGKTP IiCMLLVKDSK PHFTTVFQNS VYKVLEWKE 

Seq ID NO: 500 DNA sequence

Nucleic Acid Accession #: NM_001276.1

Coding sequence: 127..1278

1 11 21 31 41 51 

AGTGGAGTGG GACAGGTATA TAAAGGAAGT ACAGGGCCTG GGGAAGAGGC CCTGTCTAGG 60 
TAGCTGGCAC CAGGAGCX3GT GGGCAAGGGA AGAGGCXACA CCCTGCTCTG CTCTGCTGCA 120 
GCX3VGAATGG GTGTGAAQGC GTCTCAAACA GGCTTTGTGG TCXrTGGTGCT GCTCCAGTGC 180

TGCTCTCCftT ACAAACTGGT CTGCTACTAC ACCAGCTQGT CCCAGXAOOG GGAAOGOGAT 240 

GGGAGCTGCT TCOCAGA3GC OCTTGAOCGC TTOCTCTGTA CCCACATCAT CTACAGCITT 300 

GCCAATATAA GCAACGATCA CATCGACAOC TGGGAGTGGA ATGATGTGAC GCTCTACGGC 360 

ATCCTCAACA CACTCAAGAA CAGGAACOCX: AACCTGAAGA CTCTCTTGTC TGTOGtSAGGA 420 

TXSGAACTTTG GGTCTCAAAG ATTTTCCAAG ATAGCCTCCA ACACCCAGAG TOG COSGA CT 480 

TTCATCAAGT CAGTACCGCC ATTCCTGOGC ACCCATGGCT TTGATGGGCT GGACCTTGCC 540 

TGGCTCTACC CTGGAOGGAfi AGACAAACAG CATTTTACCA CCCTAATCAA GGAAATGAAG 600 

GCCGAATTTA TAAAGGAAGC CCAGCCAGGG AAAAAGCAGC TCCTGCTCAG OGCAGCACTG 660 

TCroOGGGGA AGGTCACCAT TCACAGCAGC TATGACAtTG CCAAGATATC CCAACACCTG 720 

GATTTCATTA GCATCATGAC CTAOGATTTT CATGGAGCCT GGCGTGGGAC CACAGGCCAT 780 

CACAGTCCCC TGTTCCGAGG TCAGGAGGAT GCAAGTCCTG ACAGATTCAG CAACACTGAC 840 

lATGCTGTCG GGTACATGTT GAGGCTGGGG GCTCXriGCCA GTAAGCTGGT GATGGGCATC 900 

CCCACCTTCX; GGAG6AGCTT CACTCTGGCT TCTTCTGAGA CTGGTGTTGG AGCCCCAATC 960 

TCAGGACCGG GAATTCCAGG CCGGTTCACC AAGGAGGCAG GGACCCTTGC CTACTATGAG 1020 

ATCTGTCACT TCCTCCXSOSG ACCCACAGTC-CATAGAACCC TCGGCCRGCA GGTCCOCTAT 1080 

GCCACXIAAGG GCAACCAGTG GGTAGGATAC GACGACCAG6 AAAGCGTCAA AAGCRAGGTG 1140

CAGTACCTGA AGGATAGGCA GCTGGCAGGC GCX3VTGGTAT GGGCCCTGGA CCTGGATGAC 1200 

TTCCAGGGCr CCTTCTGCGG CCAGGATCTG CGCTTCCCTC TCACCAATGC CATCAAGGAT 1260 

GCACTCGCTC CAACGTAGCC CTCTGTTCTG CACACAGCAC GGGGGCCAAG GATGCCCCGT 1320 

CCCCCTCTGG CTCCAGCTGG COGGGAGCCT GATCACCTGC CCTGCTGAGT CCCAGGCTGA 1380 

GCCTCAGTCT CC C m X TV G OGGCCTATGC AGAOGTCCAC AACACACAGA TTTGAGCTCA 1440 

GCCCTOGTGG GCAGAGAGGT AGGGATGGGG CTGTOGGGAT ACTGAGGCAT CSKaATGTAA 1500

GACTOGGGAT TAGTACACAC TTGTTGATGA TTAATGGAAA TGTTTACAGA TCOOCAAGGC 1560 

TGGCAAGGGA ATTTCTTCAA CTCCCTGCCC CCTAGCCCTC CTTATCAAAG GACACCATTT 1620 

TQGCAAGCTC TATCACCAAG GAGCCAAACA TCCTACAAGA CACAGTGACC ATACTAATTA 1680 

TACCCCCIGC AAAGCCAGCT TGAAACCTTC ACTTAGGAAC GTAATCX3TGT CCCCTATCCT 1740 

ACTTCCCCTT CCTAATTCCA CAGCTGCTCA ATAAAGTACA AGAGTTTAAC AGTGTGTTGG 1800 

CGCTTTCCTT TGGTCTATCT TTGAGCGCOC ACTAGACCCA CTGGACTCAC CTCCOCCATC 1860 

TCTTCTGGGT TCCTTCCTCT GAGCCTTGGG ACCCCTGAOC TTGCAOAGAT GAAGGCOGOC 1920 
ATGTT 



Seq ID NO: 501 Protein sequence
Protein Accession #: NP_001267.1

1 11 21 31 41 51

| | | | | |

KGVKASQTGF WLVLLQCCS MKLVCYYTS WSQYREGDGS CPPDALDRFL CTHIIYSFAM 60

ISKDKIiyrHE WMDVTLYGML NTLKKIttlPNL KTIiLSVGGWM PGSQRFSKIA SNTQSRRTPI 120 

KSVPPPUmi 6FDGLDLAWL YPGRRDKQKF TTLIKEMKAE FIKEAQPGKK QLLLSAALSA 180 

GKVTIDSSYD lAKISQHLDF ISIMTYDFHG AWRGTTGHHS PLFRGQBDAS PDRFSNTDYA 240 

VGYMLRLGAP ASKLVKGIPT FGRSFTLASS ETGVGAPISG PGIPGRFTKB AGTLAYYBIC 300 

DPLRGATVHR TLGQQVPYAT KGNQWVGYDD QBSVKSKVQY LKDRQIiAGAM VWALDLDDFQ 360 
GSFCGQDLRP PLTNAIKDAL AAT 



Seq ID NO: 502 DNA sequence

Nucleic Acid Accession #: NM_006474.1

Coding sequence: 181..669

1 11 21 31 41 51

| | | | | |

GCTGCCTAGG GTCTGGAAAG CTCGGGCACC CTCCCTCTCC GGGGCTCCTG C TCCCA CCCC 60 

TCCGGOCCCC CCACCGTCGC GCTCCTCCAG GCTGGGCCTG TGGCCGCGGT GCTTTTAATT 120 

TTCCCCCAGC TCAGAATCTT GCTGCTOGGC CCCCAGGAGA GCAACAACTC AACGGGAACX5 180 

ATGTGGAAGG TGTCAOCTCT GCTCTTOGTT TTGGGAAGOG OGTCGC TCTG GGTCCTGGCA 240 

GAAGGAGGCA GCACAGGCCA GCCAGAAGAT GACACTGAGA CTACAGGTTT GGAAGGGQGC 300 

GTTGCCATGC CAGGTGCCGA AGATGATGTG GTGACTCCAG GAACCAGCGA AGACCGCTAT 360 

AAGTCTGGCT TGACAACTCT GGTGGCAACA AGTGTCAACA GTGTAACAGG CATTCGCATC 420 

GAGGATCTGC CAACTTCAGA AAGCACAGTC CAOSOGCAAG AACAAAGTCC AAGOGCCACA 480 

GCCTCAAAOG TGGCCACCAG TCACTCCAC3G GAGAAAGTGG ATGGAGACAC ACAGACAACA 540 

CTTGAGAAAG ATGGTTTGTC AACAGTGACC CTGGTTGGAA TCATAGTTGG GGTCTTACTA 600 

GCCATCGGTT TCATTGGTGG AATCATCGTT GTGGTTATGC GAAAAATGTC GGGAAGGTAC 660 

TCGCCCTAAA GAGCTGAAGG GTTACGCCCT GCTTGCCAAC GTCCTTTAAA AAAAGACOGT 720 

TTCTCACTCT GTGGCCCTGT CCCTGAGCTC GTGGGGAGAA GATGACCCTG GGAACATTTG 780 

GGGGCGCATT CAGATTCCAC GGTGACTTTC CGTTTGCCAA ATTAACCGAG GAAAGACCTT 840 
TCACCAGATT TGGTTCTTAA ACTTT 



Seq ID NO: 503 Protein sequence
Protein Accession #: NP_006465.1

1 11 21 31 41 51

| | | | | |

KWKVSAIJiFV LGSASLWVXA EGASTGQPED DTBTTGLBGO VAMPGABDDV VTPGTSEDRY
KSGLTTZiVAT SVNSVTGXRI EDLPTSESTV HAQEQSPSAT ASHVATSHST EKVDGDTQTT
VEKDGLSTVT LVGIIVGVLL AIGFIGGIIV WMRRMSGRY SP



Seq ID NO: 504 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 62..895

1 11 21 31 41 51

| | | | | |

CACTGCTCTG AGAATTTGTG AGCAGCCCCT AACAGGCTGT TACTTCACTA CAACTGACGA 60
TATGATCATC TTAATTTACT TATTTCTCTT GCTATGGGAA GACACTCAAG GATGGGGATT 120
CAAGGATGGA ATTTTTCATA ACTCCATATG GCTTGAAOGA GCAGCCGGTG TGTACCACAG 180
AGAAGCACG6 TCTGGCAAAT ACAAGCTCAC CTAOGCAGAA GCTAAG6CG6 TGTGTGAATT 240
TGAA0G0G6C CATCTOGCAA CTTACAAGCA GCTA6AGGCA GCCAGAAAAA TTGGATTTCA 300
TGTCTGTGCT GCTOGATGGA TQGCTAAGGG CAGAGTTGGA TACOCCATTG TGAAGCCAGG 360
GCCCAACTGT GGATTTGGAA AAACTGGCAT TATTGATTAT GGAATOOGTC TCAATAGGAG 420
TGAAAGATGG GATCCCTATT GCTACAACOC ACAOGCAAAG GAGTSTGGTQ GCGTCTTTAC 480
AGATGCAAAG CAAATTTTTA AATCTCCAGG CTTOCCftAAT GAGTAGGAAG ATAAOCAAAT 540
CTGCTACTGG CACATTACAC TCAAGTATGG TCAGCX3TATT CAOCTGAGTT TTTTAGATTT 600
TGACXTTGAA GATGACCCAG GTTGCTTGGC TGATTATG7T GAAATATATG ACAGTTACGA 660
TGATGTCCAT GGCTTTGTGG GAAGATACTG TGGAGATGAG CTTCCAGATG ACATCATCAG 720
TACAGGAAAT GTCATGACCT TGAAGTTTCT AAGTGATGCT TCAGTGACAG CTGGAGGTTT 780
CCAAATCAAA TATGTTGCAA TQGATCCTGT ATGCAAATCC AGTCAAOGftA AAAATACAAG 840
TACTACTTCT ACTGGAAATA AAAACTTTTT AGCTGGAAGA TTTAGCCACT TATAAAAAAA 900
AAAAAAAGGA TGATCAAAAC ACACAGTGTT TATGTTGGAA TCTTTTGGAA CTCCTTTGAT 960
CTCACTGTTA TTATTAACAT TTATTTATTA TTTTTCTAAA TGTGAAAGCA ATACATAATT 1020
TAGGGAAAAT TGGAAAATAT AQGAAACTTT AAACGAGAAA ATGAAACCTC TCATAATCCC 1080
ACTGCATAGA AATAACAAGC GTTAACATTT TCATATTTTT TTCTTTCAGT CATTTTTCTA 1140
TTTGTGGTAT ATGTATATAT GTACCTATAT GTATTTGCAT TTGAAATTTT GGAATCCTGC 1200
TCTATGTACA GTTTTGTATT ATACTTTTTA AATCTTGAAC TTTATAAACA TfTTCTGAAA 1260
TCATTGATTA TTCTACAAAA ACATGATTTT AAACAGCTGT AAAATATTCT ATGATATGAA 1320
TGTTTTATOC ATTATTTAAG OCTGTCTCTA TTGTTGGAAT TTCAGGTCAT TTTCATAAAT 1380
ATTGTTGCAA TAAATATCXIT TGAACACACA AAAAAAAAAA AA

PCT/US02/12476

Seq ID NO: 505 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MIILIYLPLL LWSDTQ6HGF KDGIFKNSIW LERAAGVYHR EARSGKYKLT YAEAKAVCEF
EGGHLATYKQ LEAARXIGPH VCAAGWMAKG RVGYPIVKPG PNCGPGKTGI IpYGIRLNRS
ERWDAYCYNP HAKEOGGVFT DPKQIFKSPG PPNEYEDSQI CYHHIRLKYG QRIHLSFLDP
DLEDDPGCLA DYVEIYDSYD DVEGPVGRYC GDEIjPDDIIS TX^NVMTLKPL SDASVTAG6F
QIKYVAMDPV SKSSOGKNTS TTSTGNKNFL AGRPSHL



Seq ID NO: 506 DNA sequence

Nucleic Acid Accession #: NM_007115.1

Coding sequence: 69..902

1 11 21 31 41 51

| | | | | |

GAATTOGCAC TGCTCTGAGA ATTTGTGAGC AGCCXrCTAAC AGGCTGTTAC TTCACTACAA 60
CTGAOGATAT GATCATCTTA ATTTACTTAT TTCTCTTGCT ATGGGAAGAC ACTCAAGGAT 120
GGGGATTCAA GGATGGAATT TTTCATAACT CCATATGGCT TGAACGAGCA GCCGGTGTGT 180
ACCACAGAGA AGCACGGTCT G6CAAATACA AGCTCACCTA aSCAGAAGCr AAGG06GTGT 240
GTGAATTTGA AGGOGGCCAT CTCGCAACTT ACAAGCAGCT AGAGGCAGCC AGAAAAATTG 300
GATTTCATGT CTGTGCTGCT GGATGGATGG CTAAGGGCAG AGTTGGATAC CCCATTGTGA 360
AGCCAGGGCC CAACTGATGA TTTGGAAAAA CTGGCATTAT TGATTATGGA ATCOGTCTCA 420
ATAGGAGTGA AAGATGGGAT GCCTATTGCT ACAACCCACA CGCAAAGGAG TGTGGTGGCG 480
TCTTTACAGA TCCAAAGCCA ATTTTTAAAT CTCCAGGCTT CCCAAATGAG TACGAAGATA 540
ACCAAATCTG CTACTGGCAC ATTAGACTCA AGTATGGTCA GOGTATTCAC CTGAGTTTTT 600
TAGATTTTGA CCTTGAAGAT GACCCAGGTT GCTTGGCTGA TTATGTTGAA ATATATGACA 660
GTTACGATGA TGTCCATGGC TTTGTGGGAA GATACTGTGG AGATGAGCTT CCAGATGACA 720
TCATCAGTAC AGGAAATGTC ATGACCTTGA AGTrrCTAAC TGATGCTTCA GTGACAGCTG 780
GAGGTTTCCA AATCAAATAT GTTGCAATGG ATCCTGTATC CAAATCCAGT CAAGGAAAAA 840
ATACAAGTAC TACTTCTACT GGAAATAAAA ACTTTTTAGC TGGAAGATTT AGCCACTTAT 900
AAAAAAAAAA AAG6ATGATC AAAACACACA GTGTTTATGT TGGAATCTTT TGGAACTCCT 960
TTGATCTCAC TGTTATTATT AACATTTATT TATTATTTTT CTAAATGTGA AAGAAATACA 1020
TAATTTAGGG AAAATTGGAA AATATAGGAA ACTTTAAAOG AGAAAATGAA ACCTCTCATA 1080
ATCCCACTGC ATAGAAATAA CAAGCGTTAA CATTTTCATA 'mrri'i'tJi'T TCAGTCATTT 1140
TTGTATTTGT GGTATATGTA TATATGTACC TATATGTATT TGCATTTGAA ATTTTGGAAT 1200
CCTGCTCTAT GTACAGTTTT GTATTATACT TTTTAAATCT TGAACTTTAT GAACATTTTC 1260
TGAAATCATT GATTATTCTA CAAAAACATG ATTTTAAACA GCTGTAAAAT ATTCTATGAT 1320
ATGAATGTTT TATGCATTAT TTAAGCCTGT CTCTATTGTT GGAATTTCAG GTCATTTTCA 1380
TAAATATTGT T6CAATAAAT ATCCTTCGGA ATTC

Seq ID NO: 507 Protein sequence
Protein Accession #: NP_009046.1

1 11 21 31 41 51

| | | | | |

MIILIYIiFLL LWEDTQGWGP KDGIFHNSIW LERAAGVYHR EARSGKYKLT YAEAKAVCEF
EGGHLATYKQ LEAARKIGFH VCAAGWMAKG RVGYPIVKPG PNXXFGKTGX ZDYGIRLHRS
ERWDAYCYNP HAKECGGVPT DPKRIFKSPG FPNEYEDNQI CYWHIRLKYG QRIHLSFLDP
DLEDDPGCLA DYVEIYDSYD DVHGFVGRYC GDELPDDIIS TGNVMTLKPL SDASVTAGGF
QIKYVAMDPV SKSSQGKNTS TTSTGNKNFL AGRFSHL

Seq ID NO: 508 DNA sequence

Nucleic Acid Accession #: NM_001044.1

Coding sequence: 129..1991

1 11 21 31 41 51

| | | | | |

ACCGCTCCGG AGOGGGAGGG GAGGCTTCGC GGAACGCTCT CGGCX3CCAGG ACTC6CGTGC 60

AAAGCCCAGG CCCGGGOGGC CAGACCAAGA GGGAAGAAGC ACA6AATTCC TCAACTCCCA 120

GTGTGCCCAT GAGTAAGAGC AAATGCTCCG TGGGACTCAT GTCTTCCGTG GTGGCCCCGG 180

CTAAGGAGCC CAATGCCGTG GGCCCGAAGG AGGTGGAGCT CATCCTTGTC AAGGAGCAGA 240

AGGGAGTGCA GCTCACCAGC TCCACCCTCA CCAACCCGOS GCAGAGCCCC GTGGAGGCGC 300

AGGATOGGGA GAOCTGGOGC AAGAAGATC6 ACTTTCTOCT GTCOGTCATT GGCTTTGCTG 360

TGGACCTGGC CAAGGTCTGG CGGTTCCCCT ACCTGTGCTA CAAAAATGGT GGCGGTGCCT 420

TCCTGGTOCC CTACCTGCTC TTCATGGTCA TTKTGGGAT GOCACTTTTC TACaTGGAGC 480

TGGCCCT O GG CCAGTTCAAC AGGGAAGGGG C0GC1X3GTGT CTGGAAGATC TGCCCCATAC 540 

TGAAAGGTGT GGGCTTCACG GTCATCCTCA TCTCACTGTA TGTOSGCTTC TTCTACAACG 600 

TCATCATOGC CTGGGOGCTG CACTATCTCT TCrcCTCCTT CACCACGGAC CTOXCrGGA 660 

TCCRCTGCAA CAACTCCTGG AACAGCCCCA ACTGCTOQGA TGCCCATCCT GGTGACTCCA 720 

GTQQA6ACAG CTOSGGCCTC AAG6ACACTT TTGGGAOCAC ACCTGCTGCC GAGTACTTTG 780 

AACGTG G OG T GCTGCACCTC CACCAGAGCC ATGGCATCGA CGACCTGG6G CCiOJG O G G T 840 

GGCAGCrCAC AGCCTGCCTG GTGCTGGTCS^ TOSTGCTGCT CTACTTCAGC CTCTGGAAGG 900 

GOGTGAAGAC CTCAGGGAAG GTGGTATGGA TOVCAGCCAC CATGCCATAC GTGC>7CCTCA 960 

CTGOCXSGCT CCTGOGTGGG GTCACCCTCC CTGGAGCCAT AGAOGGCATC AGAGCATAiCC 1020 

TGAGCGTTGA CTTCTACCGG CTCTGCGAGG CGTCTGTTTG GATTGACGOG GCCACCCAGG 1080 

TGTGCTTCTC CCTGGGCGTG GGGTTOGGGG TGCTGATOGC CTTCTCCAGC 5ACRACAAGT 1140 

TCACCAACAA CTGCTACAGG GAOGOGATTG TCACCACCTC CATCAACTCC CTGAOGAGCT 1200 

TCTCCTCOGG CTTOGTOGTC TTCTCCTTCC TGGGGTACAT GGCACAGAAG CACAGTGTGC 1260 

OCATCGGGGA OGTGGCCAAG GACGGGCCAG GGCTGATCTT CATCATCTAC CCGGAAGCCA 1320 

TOGCCAOGCT CCCTCTGTCC TCAGCCTGGG COGTGGTCTT CTTCATCATG CTGCTCACCC 1380 

TGGGTATCGA CAGOGCCATG GGTGGTATGG AGTCAGTGAT CACOGGGCTC ATCGATGAGT 1440 

TCCAGCTGCT GCACAGACAC GGTGAGCTCT TCAOOCTCTT CATCGTCCTG GCGACCTTCC 1500 

TCCTGTCCCT GTTCTGCGTC ACCAAGGOT6 GCMCTAaST CTTCAGGCTC CTGGACCATT 1560 

TTGCAGCCGG CACGTCCATC CTCTTTGGAG TGCTCATCGA AGCCATCGGA GTGGCCTGGT 1620 

TCTATGGTCT TGGGCAGTTC AGCGACGACA TCCAGCAGAT GACCGGGCAG CGGCCCAGCC 1680 

TGTACTGGCX; GCTGTGCTCG AAGCIGGTCA GCCCCTGCTT TCTCCTGTTC GTGGTCGTGG 1740 

TCAGCATTGT GACCTTCAGA CCCCCCCACT ACGGAGCCTA CATCTTCCCC GACTGGGCCA 1800

A0QOGCTGG6 CTOGGTCATC GCCACATCCT OCATG6CCAT GGTGCCCATC TATGOGGCCT 1860 

ACAAGTTCTG CAGCCTCOCT GGGTCCTTTC GAGAGAAACT GGCCTAOGCC ATTGCACCX3G 1920 

AGAAGGACOG TGAGCTGGTG GACAGAGGGG AGGTGCGCCA GTTCAOGCTC OSCCACTGGC 1980 

TCAAGGTGTA GAGGGAGCAG AGACGAAGAC CCCAGGAAGT CATCCTGCAA TGGGAGAGAC 2040 

AGGAACAAAC CAAGGAAATC TAAGTTTCGA GAfiAAAGGAG GGCAACTTCT ACTCTTCAAC 2100

CTCTACTGAA AACACAAACA ACAAAGCAGA AGACTCCTCT CTTCTGACTG TTTACACCTT 2160 

TGOGTGCCXSG GAGCGCACCT OGCCGTGTCT TGTGTTGCTG TAATAACGAC GTAGATCTGT 2220 

GCAGCGAGGT CCACCCOGTT GTTGTCCCTG CAGGGCAGAA AAACGTCTAA CTTCATGCTG 2280 

TCTGTGTCAG GCTCCCTCCC TCCCTGCTCC CT G CTCCOGG CTCTGAGGCT GGCOCAQGGG 2340 

CACTGTGTTC TCAGGCGGG6 ATCAC6ATCC TTSTAGACGC ACCTGCTGAG AATCCCOGTG 2400 

CTCACAGTAG CTTCCTAGAC CATTTACTTT GCCCATATTA AAAAGCXAAG TGTCCTGCTT 2460 

GGTTTAGCTG TGCAGAAGGT GAAATGGAGG AAACCACAAA TTCATGCAAA GTCCTTTCCC 2520 

GATGCGTGGC TCCCAGCAGA GGCCGTAAAT TGAGCGTTGA GTTGACACAT TGCACACACA 2580 

GTCTGTTCAG AGGCATTGGA GGATGGGGGT CCTGGTATGT CTCACCAGGA AATTCTGTTT 2640 

ATGTTCTTGC AGCAGAGAGA AATAAAACTC CTT6AAACCA GCTCAGGCTA CTGCCACTCA 2700 

GGCAGCCTGT GGGTCCTTGT GGTGTAGGGA ACGGCCTGAG AGGAGCGTGT CCTATCCCCG 2760 

GACGCATGCA GGGCCCCCAC AGGAGCGTGT CCTATCCCCG GACGCATGCA GGGCCCCCAC 2820 

AGGAGCATGT CCTATCCCTG GACGCATGCA GGGCCCCCAC AGGAGCGTGT ACTACCCCAG 2880 

AACGCATGCA GGGCCCCCAC AGGAGCGTGT ACTACCCCAG GACGCATGCA GGGCCCCCAC 2940 

TGGAGGGTGT ACTACCCCAG GACGCATGCA GGGCCCCCAC AGGAGCGTGT CCTATCCCCG 3000 

GACCGGAOGC ATGCAGGGOC CCCACAGGAG CGTGTACTAC CCCAGGACGC ATGCAGGGCC 3060 

CCCACAGGA6 CGTGTACTAC CCCAGGATGC ATGCAGGGCC CCCACAGGAG OGTGTACTAC 3120 

CCCAGGACGC ATGCAGGGCC CCCATGCAGG CAGCCTGCAG ACCAACACTC TGCCTGGCCT 3180 

TGAGCCGTGA CCTCCAGGAA GGGACCCCAC TGGAATTTTA TTTCTCTCAG GTGCGTGCCA 3240 

CATCAATAAC AACAGTTTTT ATGTTTGCGA ATGGCTTTTT AAAATCATAT TTACCTGTGA 3300 

ATCAAAACAA ATTCAAGAAT GCAGTATCCG CGAGCCTGCT TGCTGATATT GCAGTrTTTG 3360 

TTTACAAGAA TAATTAGCAA TACTGAGTGA AGGATGTTG6 CCAAAAGCTG CTTTCCATGG 3420 

CACACTGCCC TCTGCCACTG ACAGGAAAGT G6ATGCCATA GTTTGAATTC ATGCCTCAAG 3480 

TOGGTGGGCC TGCCTACGTG CTGCCCGAGG GCAGQGGCCG TGCAGGGCCA GTCATGGCTG 3540 

TCCCCTGCAA GTGGACGTGG GCTCCAGGGA CTGGAGTGTA ATGCTCGGTG GGAGCCGTCA 3600 

GCCTGTGAAC TGCCAGGCAG CTGCAGTTAG CACAGAGGAT GGCTTCCCCA TTGCCTTCTG 3660 

GGGAGGGACA CAGAGGAOG6 CTTCCCCATC GCCTTCTGGC CGCTGCAGTC AGCACAGAGA 3720 

GCGGCTTCCC CATTGCCTTC TGGGGAGGGA CACAGAGGAC AGTTTCCCCA TCGCCTTCTG 3780 

GTTGTTGAAG ACAGCACAGA GAGCGGCTTC CCCATCGCCT TCTGGGGAGG GGCTCOGTGT 3840 

AGCAACCCAG GTGTT6TC0G TGTCTGTTGA CCAATCTCTA TTCA6CATGG TGTG6GTGCC 3900 
TAAGCACAAT AAAAGACATC CACAATGGAA AAAAAAAAAG GAATTC 

Seq ID NO: 509 Protein sequence
Protein Accession #: NP_001035.1

1 11 21 31 41 51

| | | | | |

MSKSKCSVGI* MSSWAPAKE PNAVGPKEVE LILVKEQNGV QLTSSTLTNP RQSPVEAQDR 60 

ETWGKKIDPL LSVIGPAVDL ANVWRPPYLC YKNGGGAFLV PYliLFMVIAG MPLFYMELAL 120 

GQFNREGAAG VWKICPIIiKG VGFTVII/ISL YVGFFYNVII AWALHYLFSS FTTELPWIHC 180 

MNSHNSPKCS DAaPGDSSGD SSGLNDTFGT TPAAEYFCRG VLHLHQSHGI DDLGPPRWQL 240 

TACLVLVIVL LYFSLWKGVK TSGKWWITA TMPYVVLTAL liLRGVTLPGA IDGIRAYLSV 300 

DFYRLCEA5V HIDAATQVCP SLGVGFGVLI AFSSYNKFTN NCYRDAIVTT SINSLTSFSS 360 

GFWFSPLGY MAQKHSVPZG DVAKDGPGLZ FIIYPEAZAT LPLSSAHAW FFIMLLTLGI 420 

DSAMGGMESV ITGLIDEFQIi LHRHRELPTL PIVLATPLLS LPCVTNGGIY VFTLLDHPAA 480 

GTSILFGVLI EAIGVAWPYG VGQFSDDIQQ MTGQRPSLYW RLCWKLVSPC FLLPWWSI 540 

VTFRPPHYGA YZPPDWAHAL GWVIATSSMA MVPIYAAYKF CSLPGSFREX LAYAIAPBKD 600 
RELVDRGEVH QFTLRHHIiXV 

Seq ID NO: 510 DNA sequence

Nucleic Acid Accession #: NM_001216.1

Coding sequence: 43..1422

1 11 21 31 41 51

| | | | | |

GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60 

AGCCCCTGGC TCCCTCXGTT GATCCCGGCC OCTGCTCCAG GCCTCACTGT GCAACIGCTG 120 

CTGTCACTGC TGCTTCTGAT GCCTGTOCAT CCCCRGAGGT TGCCCCG6AT GCftGGAGBAT 180 

TCCCCCTTGG GAGGAGGCTC TTCTGGGGAA GATGACCCAC TGGGOGAGGA GGATCTGCCC 240 
AGIGAAGAGG ATTCAGOCAG AGAOGAGGAT OCAOCCGGAG AGGAGGATCT ACCTGGAGAG 300
GAGGATCTAC CTGGAGAGGA OGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGIGGGC 360 
TCCCTGAAGT TAGACGATCT ACCTACTGTT GAGGCTOCTG GAGATCCTCA AGAACCCCAG 420 
AATAATGCCC ACAGGGACAA AGAAGGGGAT GACCAGAGTC ATTGGCGCTA TGGAGGOGAC 480 
COSCCCTGGC CCOGGGTGTC CCXaGCCTGC GOGGGCCGCT TCCaCTCCCC GGTGGATATC 540 
GGCCCCCAGC T0600G0CTT CPGCC O GGCC CTGGGGCCOC TGGAACTCCT GGGCTTOCAG 600 
CTUCO G OCXSC TCCCAGAACT G0G0CTGG6C AACAATGGCC ACAGTGtGCA ACIGAOOCTG 660 
CCTCCTGGGC TAGAGATGGC TCTGGGTOCC GGGCGGGAGT ACOGGGCTCT GCAGCTGCAT 720 
CTGCACTGGG GGGCTGCACG TCGTCOGGGC TCGGAGCACA CTGTGGAAGG CCACCGTTTC 780 
CCTGCCCAGA TCCAOGTGGT TCACCTCAGC ACOGCCTTTG CCAGAGTTGA OGAGGCCTTG 840 
GGGOGCCCGG GAGGCCTGGC CXTPGTTGGCC GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900 
A£?rGCCTATG AGCRGTTGCT GTCTCGCTTG GAAGAAATOG CTGAfiGAAGG CTCAGAGACT 960 

CAGGTCCCAG GACTGGACAT ATCTGCACTC CTGCCCTCTC ACTTCAGCC5 CTACTTOCAA 1020 

TATGAGGGGT CTCTGACTAC AC0C5CCCTGT GCCCAGGGTG TCATCTGGAC TGTGTTTAAC 1080 

CAGACAGTGA TGCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGG6GA0CT 1140 

GGTGACTCTC GGCTACAGCT GAACTTCOGA GOGAOGCAGC CTT7GAATGG GCGAGTGATT 1200 

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTOGGG CTGCTGAGCC AGTCCASCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATOCTA GCCCTGGTTT TTGGCCTOCT 'ITfT OCPG TC 1320 

ACCAGOGTOG OGTTCCTTGT GCAGAtGAGA AGGCAGCACA GAAGGGGAAC CAAAOGGGGT 1380

GTGAGCTACX: GOCCAGCAGA GGTAGC06AG ACTGGAGCCT AGAGGCTGGA TCTTGGAGAA 1440 

TGIGAGAAGC CAGCCAGAGG CATCTGAG6G GGAGCOGGTA ACTGTCCTGT CCTGCTCATT 1500 
ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA AATATTTATA AT 

Seq ID NO: 511 Protein sequence
Protein Accession #: NP_001207.1

1 11 21 31 41 51

| | | | | |

MAPLCPSPWL PLtilPAPAPG LTVQLLLSLL LLMPVHPQRL PRMQEDSPLG GGSSGEDDPL 60 
GEEDLPSEED SPREEDPPGE EDLPGEEDIiP GEEDLPEVKP KSEEEGSLKL EDLPTVEAPG 120 
DPQEPQNNAH RDKEGOOQSH WRYGGOPPWP RVSPACAGRP QSPVDIRPQL AAFCPALRPL 180 
EUiGFQLPPL PEIiRLRNSQ! SVQLTLPPGZi EMALGPGRSY KALQXiHLHHG AAGRPGSEHT 240 
VEGHRPPAEI HWHLSTAFA RVDEALGRPG GLAVLAAFLE EGPEENSAYE QLLSRLEEIA 300 
EBGSETQVPG LDISALLPSD FSRYPQYEGS LTTPPCAQGV IWTVPNQTVM LSAKQLHTIiS 360 
DTLWGPGDSR LQLNFRATQP LNGRVIEASF PAGVDSSPRA AEPVQLHSCL AAGDIJjAIjVF 420 
GIiLFAVTSVA FLVQMRRQHR RGTKGGVSYR PAEVAETGA 

Seq ID NO: 512 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..3978

1 11 21 31 41 51

| | | | | |

ATGGTGGGTG AAGGACCCTA CCTTATCTCA GATCTGGACC AGCGAGGCX3G GCGGAGATCC 60 
TTTGCAGAAA GATATGACCC CAGCCTGAAG ACCATGATCC CAGTGCGACC CTGTGCAAGG 120 
TTAGCACCCA ACCCGGTGGA TGATGCCGGG CTACTCTCCT TCGCCACATT TTCCTGGCTC 180 
ACGCCGGTGA TGGTGAAAGG CTACCGGCAA AGGCTGACCG TAGACACCCT GCCCCCATTG 240 
TCGACATATG ACTCATCTGA CACCAATGCC AAAAGATTTC GAGTCCTTTG GGATGAAGAG 300 
GTAGCAAGGG TGGGTCCTGA GAAGGCCTCT CTGAGCCACG TGGTGTGGAA ATTCCAGAGG 360 
ACACGCGTGT TGATGGACAT CGTGGCCAAC ATCCTGTGCA TCATCATGGC AGCCATAGGG 420 
CCGACAGTTC TCATTCACCA AATCCTCCAG CAGACTGAGA GGACCTCTGG GAA AGTCT GG 480 
GTTGGCATTG GACTGTGCAt' AGCCCTTTTT GCCACCGAGT TTACCAAAGT CTTCTTTTGG 540 
GCC3CTTGCCT GGGCCATCAA CTACXXSCACG GCCATCOGGT TGAAGGTGGC GCTCTCCACC 600 
TTGGTTTTTG AAAACCTAGT GTCCTTCAAG ACATTGACCC ACATCTCTGT TGGCGAGGTG 660 
CTCAATATAC TGTCAAGTGA TAGCTATTCT TTGTTTGAAG CTGCCTTGTT TTGTCCTTTG 720 
CCAGCCACCA TCCCGATCCT AATGGTCTTT TGTGCGGCGT ACGCCTTTTT CATTCTGGGG 780 
CCCACAGCTC TCATCGGGAT ATCAGTGTAT GTCATATTCA TACCCGTCCA GATGTTTATG 840 
GCCAAGCTCA ATTCAGCTTT CGQAAGGTCA GCAATTTTGG TGACAGACAA G0GAGTTCA6 900 
ACAATGAATG AGTTTCTGAC CTGCATCAGG CTGATCAAAA TGTATGCCTG GGAGAAATCT 960 

TTTACCAACA CTATCCAAGA TATAAGAAGG AGGGAAAGAA AATTACTGGA AAAAGCTGGA 1020 

TTTGTCCAAA GTGGAAACTC TGCCCTGGCC CCCATCGTGT CCACCATAGC CATOGTGCTG 1080 

ACATTATCCT GCCACATCCT CCTGAGACGC AAACTCACCG CACCCGTGGC ATTTAGTGTG 1140 

ATTGCCATGT TTAATGTAAT GAAGTTTTCX: ATTGCAATCT TGCCCTTCTC CATCAAAGCA 1200 

ATGGCTGAAG OGAATGTCTC TCTAAGGAGA ATGAAGAAAA TTCTCATAGA TAAAAGCCCC 1260 

CCATCTTACA TCACCCAACC AGAAGACCCA GATACTGTCT TGCTTTTAGC AAATGCCACC 1320 

TTGACATGGG AGCATGAAGC CAGCA06AAA AGTAGCCCAA AGAAATT6CA GAACCAGAAA 1380 

AGGCATTTAT GCAAGAAACA 6AGGTCAGAG GCATACAGTG AGAGGAGTCC ACCAGCCAAG 1440 

GGAGCCACTG GCCCAGAGGA GCAAAGTGAC AGCCTC31AAT CGGTTCTGCA CAGCATAAGC 1500 

TTTGTGGTGA GAAAGTTATG TCGTTATCCC GAAGCCCAGC TCCTGGCTTG GACGTGGCCA 1560 

GCAGTGTTTG TTGG6AGAAT CATCAGAGGA TACAGGCCTC ATGGATTTTC TGCTAAAGAC 1620 

AAGGATGAAT CTAGAAGGCT TCTTACTTGG CCGCAAGAAG TGGATAOGAC TCAAAGGGCA 1680 

GCCAAATACC TCXXX3AAGAT CTTGGGAATA TGTGGGAATG TGGGAAGIGG AAAGAGCTCC 1740 

CTCCTTGCAG CTCTCCTAGG ACAGATGCAG CTGCAGAAAG GGGTGGTGGC AGTCAATGGA 1800 

ACTTTGGCCT ACX3TTTCACA GCAGGCATGG ATCTTTCATG GAAATGTGAG AGAAAACATA 1860 

CrCTTTGGAG AAAAGTATGA TCACCAAAGG TATCAGCACA CAGTCCGCGT CTGTGGCCTC 1920 

CAGAAGGACC TGAGCAACCT CCCCTATGGA GACCTGACTG AGATTGGGGA GCGGGGCCTC 1980 

AACCTCTCTG GGGGGCAGAG GCAGAGGATT AGCCTGGCCX: GCGCTGTCTA CTCOCSACCGT 2040 

CAGCTCTACC TGCTG6AGGA CCCCCTGTCG GCOGTGGACG CCCAOGTGGG GAAGCACGTC 2100 

TTTGAGGAGT GCATTAAGAA GACGCTCAGG GGAAAGACA6 TCGTCCTGGT GACCCACCAG 2160 

CTACAGTTCT TAGAGTCTTG TGATGAAGTT ATTTTATEAG AAGATGGAGA GATTTGTGAA 2220 

AAGGGAACCC ACAAGGAGTT AATGGAGGAG AGAGGGCGCT ATGCAAAACT GATTCACAAC 2280 

CTGOGAGGAT TGCAGTTCAA GGATCCTGAA CACCTTTACA ATGCAGCAAT GGTGGAAGCC 2340 

TTCAAGGAGA GCCCTGCTGA GAGAGAGGAA GATGCTGGTA TAATCGGGTA CCTCCTTTCT 2400 

CTCTTCACTG 'i V n'CCrCfT CCTCCTGATG ATTGGCAGOG CTGCCTTCAG CAACTGGTGG 2460 

CTGGGTCrCT GGTTGGACAA GGGCTCAGGG ATQACCIGT6 GGCCCCAGGG CAACAGGAOC 2520 

ATGTGTGAGG TOGGOGGGGT GCTGGCAGAC ATOGGTCAGC ATGTGTACCA GTGGGTGTAC 2580 

ACTGCAAGCA T6GTGTTCAT GCTGGTGTTT GGC6TCACCA AAGGCTTOGT CTTCACGAAG 2640 
AGCACACTGA TGGOITOCTC CTCTCTGOVT GACAOSGTGT TTGATAAGftT CTTAAAGAGC 2700

CCAAPGAGTT TCTTTGACAC GACTCCCaCT GGCAGGCTAA TGAACCGTTr TTCCAAQGAT 2760 

ATGGACGAGC TGGATGTGAG GCTGCOGTTT CAOGCACAGA ACTTTCTGCA GCAGTTTTTT 2820 

ATC3GTGGTGT TTATTCTOGT GATCTTGGCT GCTGTGTTTC CTGCTCTCCT TTTAGTOGTG 2880 

GCCAGCCTTG CTGTAGGCTT CTTCATTCTG TTACGCATTT TCCACAGAGG AGTCCAGGAG 2940 

CTCAAGAAG6 TGGAGAAtGT CAGGOGGTCA GCCTGGTTCA CCCACATCAC CIGCTCCATG 3000 

CAOOGCCTGG GCATC317TCA CGCCTATGGC AA6AAGGAGA GCTGCATCAC CTATACTTCA 3060 

TCCAAAGGCC TGTCaTTGTC ATACATCATC CAGCTGAGCG 6ACTGCTOCA AGTGTGTGTG 3120 

CGAAGGGGAA CAGAGACGCA AGCCAAATTC AOCTCOGTGG AGCTGCTCAG GGAATACATT 3180 

TCGACCTGTG TTCCTGAATG CACTCATCCC CTCAAAGTGG GGACCTGTCC CAAGGACTGG 3240 

CCCAGCTGTG GGGAGATCAC CTTCAGAGAC TATCAGATGA GATACAGAGA CAACACCCCC 3300 

CTTGTTCTCG ACAGCCTGAA CTTGAACATA CAAAGTGGGC AGACAGTOGG GATTGTTGGA 3360 

AGAACAGGTT COGGAAAGTC ATCGTTAGGA ATGGCTTTGT T lX S SVCL X S G t GGAGCCAGCC 3420 

AGTGGCACAA TCTTTATTGA TGAGGTGGAT ATCTGCATTC TCAGCTTGGA AGACCTCAGA 3480 

ACCAAGCTGA CTGTGATCCC ACAGGATCCT GTOCTGTTTG TAGGTACAGT AAGGTACAAC 3540

TTGGATCCCT TT6AGAGTCA CACCGATGAG ATGCTCTGGC AGGTTCTGGA GAGAACATTC 3600 

ATGAGAGACA CAATAATGAA ACTCCXa^GAA AAATTACAGG CAGAAGTCAC AGAAAATGGA 3660 

GAAAACTTCT CAGTAGGGGA ACGTCACCTG CTTTGTGTGG COCGAGCTCT TCTCCGTAAT 3720 

TCAAAGATCA TTCTOCTTGA tGAAGGCAOC GOCTC TA TGG ACTCCAAGAC TCACACOCTG 3780

GTTCA6AACA CCATCAAAGA T6GCTTCAAG GGCTGCACTG TGCTGACCAT GGCCCACCGC 3840 

CTCAACACAG TTCTCAACTG OGATCAOGTC CTGGTTA7GG AAAATGGGAA GGTGATT6AG 3900 

TTTGACAAGC CTGAAGTCCT TGCAGAGAAG CCAGATTCTG CATTTGOGAT GTTACTAGCA 3960 
CCAGAAGTCA GATTGTAG 

Seq ID NO: 513 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

KVGEGPYLIS DLDQRGRRRS FAERYDPSLK TMIPVRPCAR LAPNPVDDAG LLSFATFSWL 60 

TPVMVKGYRQ RLTVDTLPPL STYDSSDTNA KRFRVLWDEE VARVG9SKAS LSHWWKFQR 120 

TRVLMDIVAN ILCIIKAAXG PTVI.IBQXLQ QTERTSGXVW VGIGLCIALF ATHFTKVFFW 180

ALAWAINYRT AIRLKVALST LVFEKLVSPK TLTHISVGEV LNILSSDSYS LFEAALFCPL 240 

PATIPILMVP CAAYAPPILG PTALIGISVY VIPIPVQMFM AKLNSAFRRS AILVTDKRVQ 300 

TMNEFLTCIR LIKMYAWEKS FTNTIQDIRR RERKLLEKAG PVQSGNSALA PIVSTIAIVL 360 

TLSCHILIiRR KLTAPVAPSV lAMFNVMKFS lAILPFSIKA MAEANVSXiRR MKKILIDKSP 420 

PSYITQPEDP DTVLLLANAT LTWEHEASRK STPKKLQNQK RHLCKKQRSE AYSERSPPAK 480 

GATGPEEQSD SLKSVLHSIS FWRKLCRYP BAQLLAWRWP AVFVGRIIRG YRPHGFSAKD 540 

KDESRRLLTW PQEVDRTQRA AKYItGKZUSI C(^GSGKSS LLAALLGQMQ liQKGWAVNG 600 

TZiAYVSOQAW IFKGNVRENI IiFGEKYDHQR YQHTVRVOGL QKOLSNLPYG DLTEIGERGL 660 

IJLSGGQRQRI SLARAVYSDR QLYLLODPLS AVDAHVGKHV FEECIKKTLR GKTWLVTHQ 720 

LQPLESCDEV ILLEDGEICE KGTHKELMEE RGRYAKLIHN LRGLQFKDPE HLYNAAMVEA 780 

PKESPAEREE DAGIIGYLLS LFTVPLFLLM IGSAAPSNWW LGLWLDKGSR MTCGPQGNRT 840 

MCEVGAVLAD IGQHVYQWVY TASMVFNLVF GVTKGFVFTK TTLMASSSLH DTVPDKILKS 900 

PMSPFX>TTPT GRLMNRFSKD MDELDVRZiPF HAEHFLQQFF KWFZLVILA AVFPAVLLW 960 

ASLAVGFFIL LRIFHRGVQB LKKVENVSRS PWFTHITSSM QGZ/3IIKAYG KXESCZTYTS 1020 

SKGLSLSYII QLSGLLQVCV RTGTETQAKF TSVBLIiREYI STCVPECTHP LKVGTCPKDW 1080 

PSCGEITPRD YQMRYRDNTP LVLDSLNLill QSGQTVGIVG RTGSGKSSLG MAI4FRLVEPA 1140 

SGTIPIDEVD ICILSLEDLR TKLTVIPQDP VLFVGTVRYM LDPFESHTDE MLWQVIjERTF 1200 

MRDTIMKLPE KLQAEVTEMG EHFSVGSRQL LCVARALLRN SKZILLDEAT ASKDSKTDni 1260 

VQNTtKDAFK GCTVLTZAHR LNTVLNGDHV LVMEKGKVIE FDXPEVLAEK TOSAFAMZtLA 1320 
AEVRL 

Seq ID NO: 514 DNA sequence 
Nucleic Acid Accession #: Z31560 
Coding sequence: 1..966

1 11 21 31 41 51

| | | | | |

CACAGCGCCC GCATGTACAA CATGATGGAG AOGGAGCTGA AGCCGCCGGG CCOGCAGCAA 60 

ACTTCGGGGG GCGG0GGCX3G CAACTCCACC GCGGCGGCGG COGGCGGCAA CCAGAAAAAC 120 

AGCCCGGACC GOGTCAAGCG GCCCATGAAT GCCTTCATGG TGTGGTCCCG CGGGCAGCGG 180 

CGCAAGATGG CCCAGGAGAA CCCCAAGATG CACAACTOGG AGATCAGCAA GCGCCTGGGC 240 

GCCGAGTGGA AACTTTTGTC GGAGACGGAG AAGCGGCCGT TCATCGACGA GGCTAAGCGG 300 

CTGCGAGGGC TGCACATGAA GGAGCACCCG GATTATAAAT ACXX3GCCC0G GOGGAAAACC 360 

AAGACGCTCA TOAAGAAGGA TAAGTACAOG CT6CC0GGGG GGCTCCTG6C CCCCGGCGGC 420 

AATAGCAT6G OGAGCGGGGT GGGGGTGGGC GCOSGCCTGG GCGGGGGCGT GAACCAGCGC 480 

ATGGACAGTT ACKCGCACAT GAACXSGCTGG AGCAACGGCA GCTACAGCAT GATGCAGGAC 540 

CAGCTGGGCr ACCGGCAGCA CCCGGGCCTC AATGCGCACG GOGCAGCGCA GATGCAGCCC 600 

ATGCACCGCr ACGACGTGAG CGCCCTGQIG TACAACTCCA TGACCAGCTC GCAGACCTAC 660 

ATGAACGGCT CGCCCACCTA CAGCATGTCC TACTCGCAGC AGGGCACCCC TGGCATGGCT 720 

CTTGGCTCCA TGGGTTCGGT GGTCAAGTCC GAGGCCAGCT CCAGCCCCCC TGTGGTTACC 780 

TCTTCCTOCC ACTCCAGGGC GCCCTGCCAG GCCGGGGACC TCCXX3GACAT GATCAGCATG 840 

TATCTCCCCG G0GC06AGGT GCOGGAACCC GCCGCCCCCA GCAGACTTCA CATGTCCCAG 900 

CACTACCA6A GOGGCCCGGT GCCOGGCAOG GCCATTAACG GCACACTGCC CCTCTCACAC 960 

ATGTGAGGGC CGGACAGOGA ACTGGAGGGG GGAGAAATTT TCAAAGAAAA AGGAG6GAAA 1020 

TGGGAGGGGT GCAAAAGAGG AGAGTAAGAA ACAGCATGGA GAAAACCXIGG TA06CTCAAA 1080 
AAAAA 

Seq ID NO: 515 Protein sequence
Protein Accession #: CAA83435

1 11 21 31 41 51

| | | | | |

USARMYMMME TELKPPGPQQ TSG6GGGNST AAAAGGKQXN SPORVRRPKN AFMWSRGQR 60 

RKMAQENPKM KNSEISKRLG AEWKLLSETB XRPFIDEAKR LRAZJIKKEHP SYKYRPRRRT 120 

KTLNKKDKYT LPG6Z1IAPGG NSt4ASGV6VG AGIiGAGVNQR MDSYAHMNGW SNGSYSHMQD 180 
QLGYPQHPGL NAHGAAQMQP MHHYDVSAIiQ YNSMTSSQTY MNGSPTYSMS YSQQGTPQ4A 240
LGSMSSWKS EASSSFFWT SSSHSRAPOQ AGDLBOKXSM YLFGAEVPEP AAPSRLHMSQ 300 
EYQSGPVPGT AIHGTI*PLSB M 

Seq ID NO: 516 DNA sequence
Nucleic Acid Accession #: U91618
Coding sequence: 29..541

1 11 21 31 41 51

| | | | | |

CGGACTTGGC TTGTTACAAG GCTGAAAGAT GATGGCAGGA ATQAAAATCC AGCTTGTATG 60 

CATGCPACTC CTGGCTTTCA CCTCCTGGAG TCTGrTGCTCA GATTCAGAAG A<3GAAATGAA 120 

AGCATTAGAA GCAGATTTCT TGACX^ATAT GCATACATCA AAGATTAGTA AAGCACATGT 180 

TCCCTCrrGG AAGATGACTC TGCTAAATGT TTGCAGTCTT GTAAATAATT TGAACAGOX 240 

AGCTGAGGAA ACAGGAGAAG TTCATGAAGA GGAGCTTGTT GCAACAAGGA AACTTCCTAC 300 

TGCTTTAGAT G6CTTTAGCT T GGAA GCftAT GTTGACAATA TACCAGCTCC ACAAAATCT6 360 

TCACAGCAGG CCTTTTCAAC ACTGGGAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 420 

TGACAAAAAT GGAAAGGAAG AAGTCATAAA GAGAAAAATT CCTTATATTC T6AAAOC3GCA 480

GCT6TATGAG AATAAACCCA GAAGACCCTA CATACTC3UUV AGAGATTCTT ACTATTACTG 540

AGAGAATAAA TCATTTATTT ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATGAA 600 

ATTATATTTG TGTGAAAATG TGACAAACAC ACTTATCTGT CTCTTCTACA ATTGTGGTTT 660 

ATTGAATGTG TTTTTCrGCA CTAATAGAAA TTAGACTAAG TGTTTTC3UU TAAATCTAAA 720 
TCTTCAAAAA AAAAAAAAAA AAATGGGGCC GCAATT 

Seq ID NO: 517 Protein sequence
Protein Accession #: AAB50564

1 11 21 31 41 51

| | | | | |
MMAGMKIQLV CHZiLIAFSSH SLCSDSEEEM KALSADFLTM MBTSRISKAB VPSUICKTLLN 60 
VCSLVNNLHS PAEETGEVHE EELVARRXLP TALDQFSLEA MLTZYQUOCI CHSRAFQHKE 120 
ItlQEDZLDTG NDKNGKEEVI iOtXIPYILKR QLYENRFRSP YIUCRDSVYY 

Seq ID NO: 518 DNA sequence

Nucleic Acid Accession #: NM_006536.2

Coding sequence: 109..2940

1 11 21 31 41 51

| | | | | |

ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCTGCATC CATATTGAAA ACCTGACACA 60 

ATGTATGCAG CAGGCTCAGT GTGAGTGAAC TGGAGGCTTC TCTACAACAT GACCCAAAGG 120 

AGCATTGCAG GTCCTATTTG CAACCTGAAG TTTGIGACTC TCCTGGTTGC CTTAAGTTCA 180 

GAACTCCCAT TCC7GGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240 

ATTGCAATTA ATCCTCAGGT ACCTGAGAAT CAGAACCTCA TCTCAAACAT TAAGGAAATG 300 

ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTACCAAGA GAAGAGTATT TTTCAGAAAT 360 

ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420 

TCATATGAAA AGGCAAATGT CATAGTGACT GACTGGTATG GGGCACATGG AGATGATCCA 480 

TACACCCTAC AATACAGAGG GTGTGGAAAA GAGGGAAAAT ACATTCATTT CACACCTAAT 540

TTCCTACTGA ATGATAACTT AACAGCTGGC TAOGGATCAC GAGGCOGAGT GTTTGTCCAr 600 

GAATOGGCCC ACCTCCGTTQ GQGTGTGTTC GATGAGTATA ACAATGACAA ACCTTTCTAC 660 

ATAAATGOGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720 

GTGTGTGAAA AAGGTCCTTG CCCCCAAGAA AACTGTATTA TTAGTAAGCT TTTTAAAGAA 780 

GGATGCACCT TTATCTACAA TAGCACCCAA AATGCAACTG CATCAATAAT GTTCATGCAA 840 

AGTTTATCTT CTGTGGTTGA ATTTTGTAAT GCAAGTACCC ACAACCAAGA AGCACCAAAC 900 

CTACAGAACC AGATG7GCAG CCTCAGAAGT GCATGGGATG TAATCACAGA CTCTGCTGAC 960 
TTTCACCACA GCTTTCCCAT GAATGGGACT GAGCTTCCAC CTCCTCCCAC ATTCTCGCTT 1020

GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCAGAG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAGCC GCAGAATTTT ATTTGATGCA GATTGTTGAA 1140 

ATTCATACCT TCGTGGGCAT TGCCAGTTTC GACAGCAAAG GAGAGATCAG AGCCCAGCTA 1200 

CAOCAAATTA ACAGCAATGA TGATCGAAAG TTGCTGGTTT CATATCTGCC CACCACTGTA 1260 

TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGGTTGAA 1320 

AAACTGAATG GAAAAGCTTA TGGCTCTGTG ATGATATTAG TGACCAGCGG AGATGATAAG 1380 

CTTCTTGGCA ATTGCTTAOC CACTGTGCTC AGCAGTGGTT CAACAATTCA CTOCATTGCC 1440 

CTGGGTTCAT CTGCAGCCCC AAATCTGGAG GAATTATCAC GTCTTACAGG AGGTTTAAAG 1500 

TTCTTTGTTC CAGATATATC AAACTCCAAT AGCATGATTG ATGCTTTCAG TAGAATTTCC 1560 

TCTGGAACTG GAGACATTTT CCAGCAACAT ATTCAGCTTG AAAGTACAGG TGAAAATGTC 1620 

AAACCTCACC ATCAATTGAA AAACACAGTG ACTGTGGATA ATACTGTGGG CAAGGACACT 1680 

ATGTTTCTAG TTACQIGGCA GGCCAGTGOT CCTCCT6AGA TTATATTAIT TGATCCTGAT 1740 

GGAGGAAAAT ACTACACAAA TAATTTTATC AOCAATCTAA CTTTTCGGAC AGCTAGTCTT 1800 

TGGATTCCAG GAACAGCTAA GCCTGGGCAC TGGACTTACA CCCTGAACAA TACCCATCAT 1860 

TCTCTGCAAG CCCTGAAAGT GACAGTGACC TCTOGCGCCT CCAACTCAGC TGTGCCCCCA 1920 

GCCACTGTGG AAGCCTTTGT GGAAAGAGAC AGCCTCCATT TTCCTCATCC TGTGATGATT 1980 

TATGCCAATG TGAAACAGGG ATTTTATCCC ATTCTTAATG CCACTGTCAC TGCCACAGTT 2040 

GAGCCAGAGA CTGGAGATCC TGTTACGCTG AGACTCCTTG ATGATGGAGC AGGTGCTGAT 2100 

GTTATAAAAA ATGATGGAAT TTACTCGAGG TATTTTTTCT CCTTTGCTGC AAATGGTAGA 2160 

TATAGCTTGA AAGTGCATGT CAATCACTCT CCCAGCATAA GCACCCCAGC CCACTCTATT 2220 

CCAGGGAGTC ATGCTATGTA TGTACCAGGT TACACAGCAA ACGGTAATAT TCAGATGAAT 2280 

GCTCCAAGGA AATCAGTAGG CAGAAATGAG GAGGAGCGAA AGTGGGGCTT TAGCCGAGTC 2340 

AGCTCAGGAG GCTCCTTTTC AGTGCTGGGA GTTCCAGCTG GCCCCCACCC TGATGTGTTT 2400 

CCACCATGCA AAATTATTGA CCTGGAAGCT GTAAAAGTAG AAGAGGAATT GACCCTATCT 2460 

IGGACAGCAC CTGGftGAAGA CTTTGATCAG GGCCAGGCTA GAAGCIATGA AATAAGAATS 2520 

AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATG CTATTrTAGT AAATACATCA 2580 

AAGCGAAATC CTCAGCAAGC TGGCATCAOG GAGATATTTA OCTTCTCACC CCAGATTTCC 2640 

AGGAATGGAC CTGAACATCA GCCAAATGGA GAAACACATG AAAGCCACAG AATTTATGTT 2700 

GCAATAOGAO CAATGGATAG GAACTCCTTA CAGTCTGCTG TATCTAACAT TGCCCAOGCG 2760 

CCTCTGTTTA TTCCCCCCAA TTCTGATCCT GTACCT6CCA GAGATTATCT TATATTGAAA 2820 

GGAGTTTTAA CA6CAATGGG TTTGATACGA ATCATTTGCC TTATTATAGT IGTGACACAT 2880 
CATACTTTAA GCAGGAAAAA GAGAGCASAC AAOAAAfiAGA ATOGAACAAA ATTATTATAA 2940

ATAAATATCC AAAGTGTCTT CCTTCTTAGA TATAAGACCC ATGGCCTTOG ACTACAAAAA 3000 

CATACTAACA AAGTCAAATT AACATCAAAA CTGTATTAAA ATGCATTGAG TTTTTGTACA 3060 

ATACAGATAA GATTTTTACA TGGTAGATCA ACAATTCTTT TTGGGGGTAG ATTAGAAAAC 3120 

CCTTACACTT TGGCTATGAA CAAATAATAA AAATTATTCT TTAAAGTAAT GTCTTTAAAG 3180 

GCAAAGGGAA GGGTAAAGTC GGACCAGTGT CAAGGAAAGT TTGTTTTATT GAGGTGGAAA 3240 

AATAGCCCCA A6CAGAGAAA AGGAGGGTAG GTCTGCATTA TAACTGTCTG TGTGAAGCAA 3300 

TCATTTAGTT ACTTTGATTA ATTTTTCTTT TCTCCTTATC TGTGCACTAC ACGTTGCTTG 3360 

TTTACATGAA GATCATGCTA TATTTTATAT ATGTAGCCCC TAATGCAAAG CTCTTTACCT 3420 

CTTGCTATTT TGTTATATAT ATTTCAGATG ACATCTCCCT GCTAATGCTC AGAGATCTTT 3480 

TTTCACTGTA AGAGGTAACC TTTAACAATA TGGGTATTAC CTTTGTCTCT TCATACCGGT 3540 

TTTATGACAA AGGTCTATTG AATTTATTTG *nJTGTAAGTT TCTACTCCCA TCAAAGCAGC 3600

TTTCTAAGTT TATTGCCTTG GGTTATTATG GAATGATAGT TATAGCCCCN TATAATGCCT 3660 
TACCTAGGAA A 

Seq ID NO: 519 Protein sequence
Protein Accession #: NP_006527.1

1 11 21 31 41 51

| | | | | |

MTQRSIAGPI CMLKPVTLLV ALSSELPPLG AGVQLQDHGY KGU*IAINPQ VPENQNLISN 60 

IKEMITEASF YLPMATKRRV FPRNIKILIP ATHKANNNSK IKQSSYBKAN VIVTDWYGAH 120 

GOOPrrLQYS GCGKEGKYIH FTPHFLLNDH LTAGYGSRGR VFVHEWAHLR WGVFDSYNND 180 

KPFYINGGNQ XKVTRCS5DZ T6IFVCEKGP CPQEHCIISK LPKEGCTPIY HSTQNATASI 240 

HPKQSL5SW BPOJASTHNQ BAPNZiQNQMC SLRSAHDVIT DSADPHHSFP MRGTELPPPP 300 

TPSLVQAGDK WCLVUDVSS KMAEADRLIiQ LQQAAEFYLM QIVEIHTPVG lASFDSKGEI 360 

RAQLHQINSN DDRKLLVSYL PTTVSAKTDI SICSGLKKGF EWEKUIGKA YGSVMILVTS 420 

GDDKLLGNCL PTVI^SGSTI HSIALGSSAA PNLEELSRLT GGUCFFVPDI SNSMSMIOAP 480 

SRISSGTGDI FQQHIQLEST GENVKPHHQL KNTVTVDNTV GNDTMFLVTW QASGPPEIIL 540 

FDPDGRKYYT NNFITNLTPR TASLWIPGTA KPGHWTYTl^ NTHHSLQALK VTVTSRASNS 600 

AVPPATVEAF VERDSLUFPH PVHIYAKVXQ GFYPILUATV TATVEPETGO PVTLRLLDQG 660 

AGADVZKNDG lYSRYFFSFA AKGRYSLKVR VNRSP8ISTP AHS2PGSHAM YVPGYTANCM 720 

IQMilAPRKSV GRHBEERKWG FSRV8SGGSF SVLGVPAGPH PDVFPPdCIZ OLEAVKVEBE 780 

LTLSWTAPGE DPDQGQATSY EIRMSKSIiQN IQDDFKNAIL VNTSKRNPQQ AGZREIFTFS 840 

PQISTNGPEH QPNGETHESH RIYVAIRAMD RNSLQSAVSN lAQAPLFIPP HSDPVPAHDY 900 
LILKGVLTAM GLIGIICLII WTHHTLSRK KRADKKEKGT KLL 

Seq ID NO: 520 DNA sequence

Nucleic Acid Accession #: NM_000226.1

Coding sequence: 82..3600

1 11 21 31 41 51

GCTTTCAGGC GATCTGGAGA AAGAACGGCA GAACACACAG CAAGGAAAGG TCCTTTCTGG 60

GGATCACCCC ATTGGCTGAA GATGAGACCA TTCTTCCTCT TGTGTTTTGC CCTGCCTGGC 120 

CTCCTGCATG CCCAACAAGC CTGCTCCCGT GGGGCCTGCT ATCCACCTGT TGGGGACCTG 180 

CTTGTTGGGA GGACCCX3GTT TCTCCXy^CT TCATCTACCT GTGGACTGAC CAAGCCTGAG 240 

ACCTACTGCA CCCAGTATGG CGAGTGGCAG ATGAAATGCT GCAAGTGTGA CTCCAGGCAG 300 

CCTCACAACT ACTACAGTCA CQGA GTAGAG AATGTGGCTT CATCCTCOSG CCCCATGOSC 360 

TGGTGGCAGT GCXAGAATGA IGIGAACOCT GTCTCTCTGC AGCIGGAOCT GGACAGGAGA 420 

TTCCAGCTTC AA6AAGTCAT GATGGAGTTC CAGGGGCCCA TGCCCGC0G6 CATGCTGATT 480 

GAGC3GCTCCT CAGACTTCGG TAAGACCTGG CGAGTGTACC AGTACXTTCGC TGCCGACTGC 540

ACCTCCACCT TCCCTCGGGT COGCCAGGGT CGGCCTCAGA GCTGGCA66A TGTTCGGTGC 600 

CAGTCCCTGC CTCAGAGGCC TAATGCACGC CTAAATGGGG GGAAGGTCCA ACTTAACCTT 660 

ATGGATTTAG TGTCTGGGAT TCCAGCAACT CAAAGTCAAA AAATTCAAGA GGTGGGGGAG 720 

ATCACAAACT TGAGAGTCAA TTTCACCAGG CTGGCCCCTG TGCCCCAAAG GGGCTACCAC 780 

CCTCCCAGCG CCTACTATGC TGTGTCCCAG CTCOGTCTGC AG6G6AGCTG CTTCTGTCaC 840 

GGCCATGCTG ATCGCTGCGC ACCCAAGCCT GGGGCCTCTG CAGGCCCCTC CAG0GCTGT6 900 

CAGGTCCACG ATGTCTGTGT CTGCCAGCAC AACACTGCCG GCCCAAATTG TGAGCGCTGT 960 

GCACCCTTCT ACAACAACCX3 GCCCTGGAGA CCGGCGGAGG GCCAGGACGC CCATGAATGC 1020 

CAAAGGTGCG ACTGCAATGG GCACTCAGAG ACATGTCACT TTGACCCOGC TGTGTTTGCC 1080 

GCCAGCCAGG GGGCATATGG AGGTGTGTGT GACAATTGCC GGGACCACAC 06AAGGCAAG 1140 

AACTGTGAGC GGTGTCAGCT GCACTATTTC GGGAAGCGGC GCC06GGAGC TTCCATTCA6 1200 

GAGACCT6CA TCTCCTGCGA GTGTGATCCG GATGGGGCAG TGCCAGGGGC TGCCTGTGAC 1260 

CCAGTGACCXS GGCAGTGTGT GTGCAAGGAG CATGTGCAGG GAGAGCGCTG TGACCTATGC 1320 

AAGCCGGGCT TCACTGGACT CACCTACGCC AACCCGCAGG GCTGCCACCG CTGTGACTGC 1380 

AACATCCTGG GGTCCCGGAG GGACATGCCG TGTGACGAGG AGAGTGGGOG CTGCCTTTGT 1440 

CTGCCCAACG TGGTGGGTCC CAAATGTCAC CAGTGTGCTC CCTACCACTG GAAGCTGGCC 1500 

AGTGGCCAGG GCTGTGAACC GTGTGCCTGC GACCCGCACA ACTCCCCTCA GCCCACAGTG 1560 

CAACCAGTTC ACAGGGCAGT GCCCTGTCGG GAAGGCTTTG GTCGCCTGAT GTGCAGCGCT 1620 

GCAGCCATCC GCCAGTGTCX: AGACCXK3ACC TATGGAGACG TGGCCACAGG ATGCCGAGCC 1680 

TGTGACTGTG ATTTCCGGGG AACAGAGGGC CCG66CTG0G ACAAGGCATC AGGCOGCTGC 1740 

CTCTGCCGCC CTGGCTTGAC CGGGCCCCGC TGTGACCAGT GCCAGCX5AGG CTACTGCAAT 1800 

CGCTACCCGG TGTGCGTGGC CTGCCACCCT TGCTTCCAGA CCTATGATGC GGACCTCCGG 1860 

GAGCAGGCCC TGCX3CTTTGG TAGACTCCX3C AATGCCACOG CCAGCCTGTG GTCAGGGCCT 1920 

GGGCTGGAGG AOCGTGGCCT GGCCTCCCGG ATCCTAGAT6 CAAAGAGTAA GATTGAGCAG 1980 

ATCCGAGCAG TTCTCAGCAG CCCCGCAGTC ACAGA6CAGG AGGTGGCTCA GGTGGCCAGT 2040 

GCCATCCTCT GCX7CAG6GG AACTCTCCAG GGCCTGCAGC TGGATCTGCC CCTGGAGGAG 2100 

GAGAOGTTGT CCCTTCGGAG AQACCTGGAG AGTCTTGACA GAAGCTTCAA TGGTCTCCTT 2160 

ACTATGTATC AGAGGAAGAG GGAGCAGTTT GAAAAAATAA GCAGTGCTGA TCCTTCAGGA 2220 

GCXTTCCGGA TGCTGAGCAC AGCCTACGAG CAGTCACCCC AGGCTGCTCA GCAGGTCTCC 2280 

GACAGCTCGC GCCTTTTGGA CCAGCTCAGG GACAGCXX3GA GAGAGGCAGA GAGGCTGGTG 2340 

CGGCAGGCGG GAGGAGGAGG AGGCACOGGC AGCCCCAAGC TTGTGGCCCT GAGGCTGGAG 2400 

ATGTCTTGGT TGGCTOACCT GACACOCACC TTCAACAAGC TCTGTGGCAA CTCCAGGCAG 2460 

ATG6CTTGCA COCCAATATC ATGCCCTGGT GAGCTATGTC CCCAAGACAA TGGCACAGCC 2520 

TGTGGCTCCC GCTGCAGGGG TGTCCTTCCC AGG6C0GGTG GGGCCTTCTT GATGG0GGG6 2580 

CAGGTGGCTG AGCAGCTGCG GGGCTTCAA7 GCCCAGCTCC AGGG6ACCAG GCAGATGATT 2640 
AGGGCAG006 AGGftATCTGC CTGACAGATT CAATOCAGTG GOCAGGGCTT GGAGAXXCAG 2700

GTGAGOGCCA GOOGCTCOCA GATGraCGAA GATGTCA6AC GCACACQGCT CCTAATCCAG 2760 

CAGGTCCGCG ACTTCCTAAC AGACCOCGAC ACTGATGCAG CCACTATGCA G6AGGTCAGC 2820 

GAGGCOGTGC TGGCCCTGTG GCTGCCCACA GACTCAGCTA CTCrrTCPGCA GAAGATGAAT 2880

GAGATCCAGG CCATTGC3U3C CAGGCTCCCC AAOGTGGACT TG G T G CTGTC CCAGACCAAG 2940 

CAGGACATTG CGCXTTGCCCG CCGGTTGCAG GCTGAGGCTG AGGAAGCCAG GAGOOGAGCC 3000 

CATGCAGTGG AGGGCCAGGT GGAAGATGTG GTTGGGAACC TGGGGCAGGG GACAGTGGCA 3060

CTGCAGCAAG CTCAGGACAC CATGCAAGGC ACCAGCCGCT CCCTTCGGCT TATOCAGGAC 3120

AGGGTTGCTG AGGTrCAGCA GGTACTGCGG CCAGCAGAAA AGCTGGTGAC AAGCATGACC 3180 

AAGCAGCTGG GTGACTTCTG GACACGGATG GAGGAGCTCC GCCACCAAGC CCX3GC3U3CAG 3240 

GGGGCAGAGG CAGTCCAGGC CCAGCAGCTT GOGGAAGGTG CCAGCGAGCA GGCATTGAGT 3300 

GGCCAAGAGG GATTTGAGAG AATAAAACAA AAGTATGCTG AGTTGAAOGA CXXXTTTGGGT 3360 

CAGAGTTCCA TGCTGGGTGA GCAGGGTGGC CX3GATCCAGA GTGTGAAGAC AGAGGCAGAG 3420 

GAGCTGTTT6 GGGAGACCAT GGAGAT6ATG GACAGGATGA AAGACATGGA GT7GGAGCTG 3480 

CTGOGGGGCA GCCAGGOCAT CA3X3CTGCGC TGGGOQGACC TGACAGGACT GGAGAAGGGT 3540 

GTGGAGCAGA TCCGTGACCA CATCAATGGG C3G0GTGCTCT ACTATGCCAC CTGCAAGTCA 3600 

TGCTACAGCT TCCAGCCOGT TGCCCCACTC ATCTGCXXCC TTTGCTTTTG GTTGGGGGCA 3660

GATTGGGTTG GAAT6CTTTC CATCTCCAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 3720 

GACCAGCCCT GGTGTGrTAGC TAGTAAGATT AGCCTGAGCT GCAGCIGAGC CTGAGOCAAT 3780 

GGGACAGTTA CACTTGACAG ACAAAGATG6 TGGAGATTGG CAT6CCATT6 AAACTAAGAG 3840 

CTCTCAAGTC AAGGAAGCTG GGCTGGGCAG TATCOCCOQC CTTTAGTTCT CCACTGGG6A 3900 

GGAATCCTGG ACCAAGCACA AAAACTTAAC AAAAGTGATG TAAAAATGAA AAGOCAAATA 3960 
AAAATCTTTG G 

Seq ID NO: 521 Protein sequence
Protein Accession #: NP_000219.1

1 11 21 31 41 51

KRPFFLLCFA LPGLLHAQQA CSRGACYPPV GDLLVCmTRF LRASSTCGLT KPBTYCTQYG 60

EWQKKCCKCD SRQPHNYYSH RVENVASSSG PMRHHQSQND VNFVSLQLDIi DRRPQLQBVM 120 

MEFQGPMFAG MLIERSSDFG KTHRVYgYIA ADCTSTFPRV BQ6RPQSHQD VRGQSXiF(^P 180

NARUIGGKVQ LSLKDV^fSQl PATQSQKZQB VGEITNLRVN FTRLAFVFQIt GYKPPSAYYA 240 

VSQLRLQGSC FCH6HADRCA PKPGASAGPS TAVQVHDVCV OQKNTAGPNC ERCAPFYKNR 300 

PWRPAEGQDA RECQRCDCNG HSETCRFDPA VFAASQGAYG GVCDNCRDHT EGKNCEROQL 360 

HYFRNRRPGA SIQETCISCE OPDGAVPGA PGDPVTGQCV CKEHVQGERC DLCKPGFTGL 420 

TYAWPQGCHR CDCNILGSRR DMPCDEESGR CLCLPNWGP KCDQCAPYHH KLASGQGCEP 480 

CACDPHMSPQ PTVQPVHRAV PCREGFGGLM CSAAAIRQCP DRTYGDVATG CRACDCDFRG 540 

TB6PGCDKAS GRCLCRPGLT GPRCDQOQRG YCNRYPVCVA CKPCFQTYDA DLREQALRFG 600 

RUtNATASLH SGPGLEDRGL ASRILDAKSK XEQIRAVLSS PAVTEQEVAQ VASAILSLRR 660 

TLQGLQLDLP LEEETLSLPR DLESLDRSFK GLLTMYQRKR EQFEKISSAD PSGAFRMLST 720 

AYEQSAQAAQ QVSDSSRLLD QLRDSRREAE RLVRQAGGGG GTGSPKLVAL RLEMSSLPDL 780 

TPTFNKLCGN SRQMACTPIS CPGELCPQDN GTACGSRCRG VLPRAGGAFL MAGQVAEQLR 840 

GFKAQLQRTR QMIRAAESSA SQIQSSAQRL BTQVSASRSQ HHEDVRSTRL IiIQQVRDFLT 900 

DPDTDAATIQ CVSEAVtiALW LPTDSATVLQ KMNEIQAIAA RLPNVDLVLS QTKQDXARAR 960 

RLQAEAEEAR SRAHAVEGQV EDWGNLRQG TVALQBAQST HQGTSRSLRIi IQDRVAEVQQ 1020 

VLRPAEKLVT SMTKQLGDFW TRMEELRHQA RQQGABAVQA QQLABGASBQ ALSAQEGPER 1080 

IKQKYAELXD RLGQSSMZjGS QGARIQSVKT BAEELFGBTH EKMDRMKDME LELLRGSQAI 1140 
KLRSADLTGL EKRVEQIRDH INGRVLYYAT CK 

Seq ID NO: 522 DNA sequence

Nucleic Acid Accession #: NM_001944.1

Coding sequence: 84..3083

TTTTCTTAGA CATTAACTGC AGACGGCTGG CAGGATAGAA GCAGOGGCTC ACTTGGACTT 60

TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCTCT6G 120

CCATCTTCGT GGTGGTCATA TTGGTTCATG GAGAATT6CG AATAGAGACT AAAGGTCAAT 180 

ATGATGAAGA AGAGATGACT ATGCAACAAG CTAAAAGAAG GCAAAAACXST GAATGGGTGA 240 

AATTTGCCAA ACCCTGCAGA GAAGGAGAAG ATAACTCAAA AAGAAACCCA ATTGCCAAGA 300 

TTACTTCAGA TTACCAAGCA ACCCAGAAAA TCACCTACOS AATCTCTGGA GTGGGAATCG 360 

ATCAGCCGCC TTTTGGAATC TTTGTTGTTG ACAAAAACAC TGGAGATATT AACATAACAG 420 

CTATAGTCGA CCGGGAGGAA ACTCCAAGCT TCCTGATCAC ATGTCGGGCT CTAAATGCCC 480 

AAGGACTAGA TGTAQAGAAA CCACTTATAC TAACG6TTAA AATTTTGGAT ATTAATGATA 540 

ATCCTOCAGT ATTTTCACAA CAAATTTTCA TGQ6TGAAAT TGAAGAAAAT AGTGCCTC3VA 600 

ACTCACTGGT GATGATACTA AATGCCACAG ATGCA6ATGA ACCAAACCAC TTGAATTCTA 660 

AAATTGCCTT CAAAATTGTC TCTCAOGAAC CAGCAGGCAC ACCCATGTTC CTCCTAAGCA 720 

GAAACACTGG GGAAGTCCGT ACTTTGACCA ATTCTCTTGA CCGAGAGCAA GCTAGCAGCT 780 

ATCGTCTGGT TGTGAGTGGT GCAGACAAAG ATGGAGAAGG ACTATCAACT CAATGTGAAT 840 

GTAATATTAA AGTGAAAGAT GTCAAC6ATA ACTTCCCAAT GTTTAGAGAC TCTCAGTATT 900 

CAGCAOGTAT TGAAGAAAAT ATTTTAAGTT CT6AATTACT TCGATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT CTTTACCTCT GGGAATGAAG 1020 

GAAATTGGTT TGAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTGGTGA 1080 

AGGCTCTAGA TTATGAACAA CTACAAAGOS TGAAACTTAG TATTGCTGTC AAAAACAAAG 1140 

CTGAATTTCA CCAATCAGTT ATCTCTOGAT ACCGAGTTCA GTCAACCCCA GTCACAATTC 1200 

AGGTAATAAA TGTAAGAGAA GGAATTGCAT TCCGTCCTGC TTCCAAGACA TTTACTGTGC 1260 

AAAAAGGCAT AAGTAGCAAA AAATTGGTGG ATTATATCCT GGGAACATAT CAAGCCATC36 1320 

AT6AGGACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATGGGACGT AACGATGGTG 1380 

GATACC TAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTCAAAAAT ATGAACCGAG 1440 

ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACA 1500 

CGGGTAAAAC TTCTACAGGC ACXSGTATATG TTAGAGTACC CGATTTCAAT GACAATTGTC 1560

CAACAGCTGT CCTCGAAAAA GATGCAGTTT GCAGTTCTTC ACCTTOOGTG GTTGTCTCCG 1620 

CTAGAACACT GAATAATAGA TACACTGGCC CCTATACATT TGCACTGGAA GATCAACCTG 1680 

TAAAGTTGCC TGCCGTATGG AGTATCACAA CCCTCAATGC TACCTCGGCC CTCCTCAGAG 1740 

CCCAGQAACA GATACCTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCAGA 1800 

ACAATCGGTG TGAGATGCCA GGCAGCTTGA CACTG6AA6T CTGTCAGTGT GACAACAGGO 1860 




GCATCTGTGG AACTTCTTAC OCAACCACAA G0CCTGG6AC CAGGTATGGC A GGOOG OCT 1920 

CAGGGAGGCT GGGGCCTGCC GOCATOGGCC TGCTGCTCCT TGGTCTCCTG CTGCTGCTGT 1980 

TGGCCCCCCT TCTGCTGTTG ACCTGTGACT GTGGGGCAGG rrCTACTGGG GGAGTGACAG 2040 

GTGGTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT TCATCAGTGG GGAATTGAAG 2100 

GAGOCCATCC TGAAGACAAG GAAATCACAA ATATTTGTGT GCCTCCTGTA ACAGCCAATG 2160 

GAGGCBATTT CATGGAAAGT TCTGAAQTTT GTACftAATAC GTATGCCAGA GGCRCftGOGG 2220 

TGGAAOGCAC TTCAGGAATG GAAATGACGA CTAAGCT7GG AGCAGCCACT GAATCTGGAG 2280 

GTGCPGCAQG CTTTGCAACA GGGACAGTGT CAGGAGCTGC TTCftGGATTC GGAGCAGCCA 2340 

CTGGAGTTGG CATCTGTTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400 

GAGGAACCAA TAAGGACTAC GCTGATGGGG OQATAAGCAT GAATTTTCTG GACTCCTACT 2460 

TTTCTCAGAA AGCATTTGCC TGTGOGGAGG AAGACGATGG CCAGGAAGCA AATGACTGCT 2520 

TGTTOATCTA TGATAATGAA GGCGCAGATG CCACTGGTTC TCCTGTGGGC TCOGTGGGTT 2580 

GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT GGACTCACTT GGACCCAAAT 2640 

TTAAAAAACT TGCAGAGATA AGCCTTGGTG TTGATGGTGA AGGCAAAGAA GTTCAGOCAC 2700 

CXrrCTAAAGA CAGOGGTTAT GGGATTGAAT OCTGTGGCCA TCCCATAGAA GTCCAGOGA 2760 

CAGGATTTGT TAAGTGCCAG ACTTTGTCAG GAAGTCAAGG AGCTTCTGCP TTGTOOGCCT 2820 

CTGGGTCTGT CCAGCCAGCT GTTTCCATCC CTGACCCTCT GCAGCATGGT AACTATTTAG 2880 

TAAOGGAGAC TTACTOGGCT TCTGGTTCCC TCGTGCAACC TTCCACTGCA GGCTTTGATC 2940 

CACTTCTCAC ACAAAATCTG ATAGTGACAG AAAGGGTGAT CTGTCCCATT TOCAGTGTTC 3000 

CTGGCAACCT AGCTGGCOCA AOGCAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGG 3060 

ATCCTTGCTC COGTCTAATA TGACCAGAAT GAGCTGGAAT ACCACACTGA CCAAATCTGG 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTGG GCTAATAATT 3180 

TCGCACTTAT TAGCTTCTCT CATAAACTGA TCAOGATTAT AAATTAAATG TTTGGGTTCA 3240 

TACCCCAAAA GCftATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTGTAGTAAA 3300 
TCTTAAAGTT TTTCAAAAOC CTAAAATCAT ATTOQC 

Seq ID NO: 523 Protein sequence
Protein Accession #: NP_001935.1

1 11 21 31 41 51

| | | | | |

MKGIiFPRTTG ALAIPVWIL VHGELRIBTK GQYDEEIXTM QQAKRRQKRE HVKPAKPCRE 60 

G5DNSXRNPI AKITSDYQAT QKITYRISGV GIDQPPFGIP WDXNTGDIN ITAIVDREBT 120 

PSPLITCRAL NAQGIiDVEKP LIIjTVKIIiDI NDNPPVFSC3Q IPMGEIEENS ASNSLVMILN 180 

ATDADEPNHL NSKIAPKIVS QEPAGTPMPI* LSRNTGEVRT LTNSLDREQA SS YRLWSGA 240 

DXDGEGLSTQ CEOJIKVKDV NDNPPMPRDS QYSAHIEENI LSSEUjRFQV TDLDEEYTDN 300 

WLAVYPPTSG NBGMWPBIQT DPRTNBGILK WKAtiOyEQL QSVKLSIAVK NKAEFHQSVI 360 

SRYRVQSTPV TIQVINVREG lAFRPASKTP TVQKGISSKK LVDYILOTYQ AXDEDTNKAA 420 

SNVKYVMGRN DGGYLMIDSK TAEIKFVKNM NHDSTPIVNK TITAEVIAIO EYTGBCTSTGT 480 

VYVRVPDFND NCPTAVLEKD AVCSSSPSW VSARTLNNRY TGPYTFALED QPVKLPAVHS 540 

ITTLNATSAL LRAQEQIPPG VYHISLVLTD SQNNRCEMPR SLTLEVCQCD NRGIOGTSYP 600 

TTSPGTRYGR PHSGRLGPAA IGLLLLGTiT.T. LIiLAPLLLliT CDCGAGSTGG VTGGPIPVPD 660 

GSEGTIKQWG lEGAHPEDKE ITNICVPPVT ANGADFMESS BVCTNTYARG TAVEGTSGMB 720 

MTTKLGAATE SGGAAGFATG TVSGAASGFG AATGVGICSS GQSGTMRTRH STGGTNXDYA 780 

D6AISMNFLD SYFSQKAFAC AEEDDGQBAM DCLLXYQlilEG ADATGSPVGS VGCCSFZADD 840

LDQSFLDSLG PXFKKLAGIS LGVD6BGKEV QPPSKDSGYG IBSOGHPIBV QQTGFVKCQT 900 

LSGSQGASAL SASGSVQPAV SIPOPLQKGK YLVTETYSAS GSLVQPSTAG FDPLLTQNVZ 960 
VTERVICPIS SVPGNLA6PT QIJiGSHTNLC TEDFCSRLZ 

Seq ID NO: 524 DNA sequence

Nucleic Acid Accession #: XM_058069.2

Coding sequence: 1..1413

1 11 21 31 41 51

| | | | | |

ATGAAGTTTC TTCTAATACT GCTCCTGCAG GCCACTGCTT CTGGAGCTCT TCCCCTGAAC 60 

AGCTCTACAA GCCTGGAAAA AAATAATGTG CTATTTGGTG AAAGATACTT AGAAAAATTT 120 

TATGGCCTTG AGATAAACAA ACTTCCAGTG ACAAAAATGA AATATAGTGG AAACTTAATG 180 

AAGGAAAAAA TCCAAGAAAT GCAGCACTTC TTGGGTCTGA AAGTGACCOG GCAACTGGAC 240 

ACATCTACCC TGGAGATGAT GCACX3CACCT CGATGTGGAG TCCCCGATGT CCATCATTTC 300 

AGGGAAATGC CAGGGGGGCC CX3TATGGAGG AAACATTATA TCACCTACAG AATCAATAAT 360 

TACACACCTG ACATGAACCG TGAGGATGTT GACTACGCAA TCCGGAAAGC TTTCCAAGTA 420 

T6GAGTAATG TTACCCCCTT GAAATTCAGC AAGATTAACA CAGGCATGGC TGACAnTTG 480

G'lUG rn T I G CCCGTGGAGC TCATGGAGAC TTCCATGCTT TTGATGGCAA AGGTGGAATC 540 

CTAGCCCATG CTTTTGGACC TGGATCTGGC ATTGGAGGGG ATGCACATTT OGATGAQGAC 600 

GAATTCTGGA CTACACATTC AGGAGGCACA AACTTGTTCC TCACTGCTGT TCACGAGATT 660

GGCCATTCCT TAGGTCTTGG CCATTCTAGT GATCCAAAGG CCGTAATGTT CCCCACCTAC 720 

AAATATGTTG ACATCAACAC ATTTCGCCTC TCTGCTGATG ACATACGTGG CATTCAGTCC 780 

CTGTATGGAG ACCCAAAAGA GAACCAACGC TTGOCAAATC CTGACAATTC ACAACCAGCT 840 

CTCTGTGACC CCAATTTGAG TTTT G ATGCT GTCACTACOG TGGGAAATAA OATCTTTTTC 900 

TTCAAAGACA GGTTCTTCTG GCTGAAGGTT TCT6AGA6AC CAAAGACCAG T6TTAATTTA 960 

ATTTCTTCCT TATGGCCAAC CTTGCCATCT GGCATTGAAG CTGCTTATGA AATTGAAGCC 1020 

AGAAATCAAG riTrrCl ' m ' TAAAGATGAC AAATACTGGT TAATTAGCAA TTTAAGACCA 1080 

GAGCCAAATT ATCCCAAGAG CATACATTCT TTTGGTTTTC CTAACTTTGT GAAAAAAATT 1140 

GATGCAGCTG TTTTTAACCC ACGTTTTTAT AGGACCTACT TCTTTGTAGA TAACCAGTAT 1200 

TGGAGGTATG ATGAAAGGAG ACAGATGATG GACCCTGGTT ATCCCAAACT GATTACCAAG 1260 

AACTTCCAAG GAATCGGGCX: TAAAATTGAT GCAGTCTTCT ACTCTAAAAA CAAATACTAC 1320 

TATTTCTTCC AAGGATCTAA CCAATTTGAA TATGACTTOC TACTGCAROG TATCAOCAAA 1380 
ACACTGAAAA GCAATAGCTG GTTTGGTTGT TGA 

Seq ID NO: 525 Protein sequence 
Protein Accession #: P39900

1 11 21 31 41 51

| | | | | |

MKFLLZLLLQ ATASGALPLN SSTSLEKNHV IiFGERYLEKF YGLEINKIiPV TIMKySGNLK 60 

KEKXQEMQHF LGLKVTGQLO TSTIiE»4IIAP SOGVPDVHHF REMPGGFVHR RBYITYRINK 120 
rmsomEDv oyairkafqv waivTPLKFS KurrGKADiL wpargaho? feafdgsggi 180

liAEAFGPGSG IGGDARFDEO EFHTTHSGGT KLPLmVHEI CSSUGUSESS DPKAVMFPTY 240 

KYVDINTPRL SADOIRGIQS LYGDPK£NQR LPNPDNSSPA LCDPNLSFDA VTTVGKKIPF 300 

FKDRPFWLKV SERPKTSVNL ISSI^n>TLPS GIEAAyEICA RNQVFLFKDD KVWLISML&P 360 

EPKYPKSIHS FGFPNPVKKI DAAVFSPRFY RTYFFVDKQY HRYDERRQXM DPGYPKLITK 420 
NFQ6IGFXID AVFYSHNKYY YFFQGSNQFS yDFLLQRXTK TLXSNSWFGC 

Seq ID NO: 526 DNA sequence 
Nucleic Acid Accession #: NM_024423.1
Coding sequence: 64..2590
1 11

| |

GGCAGGTCTC GCTCTCGGCA 
COGATGGCCG COGCTGGGCC
CTGACCCTCG TGATCTTCAG 
CCTTCTAAAC TAGAGGCAGA 
TCTGCAGACC TCATCGGGrTC 
TACACAGCCA GG6CTGTTGC 
GACAAAAGGA AACAGACACA
TCGAAGACAA GACACACTAG 
ATTCCTTGCT CTATGCAAGA 
GAATCTGATG CAGCACAGAA 
AAAGAACCIT TAAATTTGTT
OCTGTGGATC GTGAAGAATA
GGATATTCAG CAGATCTGCC 
CACXXTGTTT TCACAGAAGC
ACTACAGTGG GGGTGGTTrG 
CTGAAATACA GCATTTTGCA 
AGCACAGGCG TAATCACCAC
TCATTGATAA TGAAAGTACA 
ACTTGTATCA TAACAGTAAC 
TATGAAGCAT TTGTASAGGA 
GATAAGGATT TAATTAACAC 
GAAAATGGAC ATTTCAAAAT
GTAAAGCCAC TGAATTATGA 
GAAGCSCCAT TTGCTAGAGA 
GTTCATGTGA GGGATCTGGA 
ATTAAAGAAA ACTTAGCAGT 
AATAGAAATG GCAATGGTTT
ATTGATGAAA TTTCAGGGTC 
CCCAAAAATG AGTTGTATAA 
ACTGGAACAC TTGCTGTGAA 
GAATATGTAG TCATTTGCAA 
GATGAACCTG TCCATGGAGC
AGTAGACTGT GGAGCXTPCAC 
AATGCTGGAT TTCAAGAATA 
GCAACAAAAT TATTGAGAGT 
ACTTCAAGGA GTACAGGACT 
ATAGCACTGC TCTTTTCIGT
GGGAAAOGTT TTOCTGAAGA 
CCTGGAGACG ATAGAGTGTG 
AGCCAAGGTT TTT6TGQTAC 
GAAAtGATGA AAG6AGGAAA 
ACCCTGGACT CCTGCAC5GGG
GAGTGGCACA GTTTTACTCA 
TAAAAATTAA ACATAAAAGA 
OCAAGATTAT GTGCTCACTT 
CTGCTGCA6T GAAAAGCAGG 
ATTTATTACA TTAGCAGAAG
TTTGTCAGAC ATTCTGGAGG 
TGTATATGAT GATTTTTTTC 
AGCCAGTTGT TGCTTATCTT 
TCTCAAACTC CAGCACTGGA

GATATTTTAG TAATAAATAT
ATATCACATT ATTATGTATT 
AGTATCACTA TGTGAAGAAA 
ATGTTGCAGC TCATAAAGAA 
TTGGAGGCAA AATGTGTTGA 
GGAAATAAAT GTGTGTGTGT
AACAAAGAG6 AAAATOGTAA 
GAAAAAA6AG AGAGCTTGCT 
GA6GAAATAG TTCCTGTCCA 
TTCTGGTTTC TGTGGGAAGG 
CTGTTTCAAG ATTTCTGCAT
GGGGTTOCCT GCTTTTTGGT 
ATAACAAAAA CATTTTAAAA 
TTCTCTCTTA TAGTGACCAA 
AGTTTAGAGG CTAGAGGGAG
ATTGTCCTTA AACCTAAGCC
TTTCATTTTT CTCCTCACTG
AGGCCTTGTG GGCCCXICTTC 
CCTTAAGTGA CTCCAGGTTT 
CTTTCTCCAG AGAAATTTTA
AAAGATCAAG TTGTCATTTT
TATTTGTACA GTCAGAGGGC 
AAGGAATAT6 GGTGGGAGTA 



21 31 41 

] i I 

CCCTCCCGGC GCCCGOGTTC 
CCGGCGCTCC GTGCX3CGGAG 
TCGTGATGGT GAAGCCTGCA 
CAAAATAATT GGCAGAGTTA 
AAGTGATOCT GATTTCAGAG 
GCTGTCTGAT AAGAAAAGAT 
GAAAGAGGTT ACTGTGCTGC 
AGAAACTGTT CTCAGGCX?rG 
GAATTCCTTG GGCCCTTTCC 
CTATACTGTC TTCTACTCAA 
TT ATATAGAA A GAGAC ACTG 
TGATGTTTTT GATTTGATTG 
CCTCCCACTA CCCATCAGGG 
AATTTATAAT TTTGAAGTTT 
TGCCACAGAC AGAGATGAAC 
GCAGACACCA AGGTCACCTG 
AGTCTCTCAT TATTTGGACA 
AGACATGGAT GGCCAGTTTT 
AGATTCAAAT GATAATGCAC 
AAATGCATTC AATGTGGAAA 
TGCCAATTGG AGAGTCAATT 
CAGCACAGAC AAAGAAACTA 
AGAAAACCX?r CAAGTGAACC 
TATTCCCAGA GrGACAG^T 
TGAGGGGOCT GAATGCACTC 
GGGGTCAAAG ATCAACGGCT 
AAGGTACAAA AAATTGCATG 
AATCATAACT TCCAAAATCC 
TATTACAGTC CTGGCAATAG 
CATTGAAGAT GTAAATGATA 
ACCAAAAATG GGGTArTACCG 
TCCATTTTAT TTCAGTTTGC 
CAAAGTTAAT GATACAGCTG 
TACCATTCCP ATTACTGTAA 
TAATCTGTGT GAATGTACTC 
AATACTTGGA AAATGGGCAA 
ATTGCTAACT TTAGTATGTG 
TTTAGCACAG CAAAACTTAA 
CTCTGCCAAT GGATTTATGA 
TATGGGATCA GGAATGAAAA 
CCAGACCTTG GAATCCTGCC 
AGGACACACG GAGGTGGACA 
ACCCOSTCTC GGTGAAGAAT 
AATTGCATCG ATGTAATCAG 
ATAACTATGA GGGAAGAGGA 
AAGAAGATGG CCTTGACTTT 
CATGCACAAA GAGATAATGT 
TTTCCAAAAA TAATATTGTA 
TCAATTTTGA ATTATGCTAC 
TTCCAAAAAG TQAAAAATGT 
ATTAAGGTCT CTAAAGCATC 
GCTGGATAAA TATTAGTCCA 
CACTTTAAGT GATAGTTTAA 
GTTTTGGAAA AGAAACAATG 
TTGGGACTCA CCCCTACTGC 
AGTGCCCTAT GAAGTAGCAA 
ATA TTAT TAT TAATCAATGC 
AAACTTGAAA T6AGGCTGGG 
AGGCCTGGGC TCTTAAATGC 
ATTTGTGTAA TTTGTTTAAA 
AAATAGGGAA TCCAATGGAA 
CCACAAGTTA GTAGCAAACT 
AGCAAGGGTC CAGAGATGAG 
CTTACCTTTA CTGAAGTTAA 
CATCTTTTTA ATTTAGATCC 
CTGAGGGGAG GATCTTACTG 
CCACAAACTT GACACCTGAT 
CCCTTCTTCT GAGTGGCATT 
TTTCGGCTTT CTGCTAAAGC 
TCXACCATCC TTCAGOGTGA 
AAATAATAGA AGAAATAOAA 
AGAACAGAGG GAACTTTGGG 
AACAGGAAGA TGCAGGCCTT 
AAAGCAACAT 



51 
I 



TCCTGGCCCT GCCOGGCATC 60 

CCGTCTGCCT GCATCTGCTG 120 

AAAAGGTGAT ACTTAATGTA 180 

ATTTGGAAGA GTGCTTCAGG 240 

TTCTAAATGA TGGGTCAGTG 300 

CATTTACCAT ATG6CTTTCT 360 

TA6AACATCA GAAGAAGGTA 420 

CCAAGAGGAG ATGG6CACCT 480 

CATTGTTTCr TCAACAAGTT 540 

TAAGTGGAOG TGGAGTTGAT 600 

6AAATCTATT TTGCACTOGG 660 

CTTATGCGTC AACTGCAGAT 720 

TAGAGGATGA AAATGACAAC 780 

TGGAAAGTAG TAGACXTTCGT 840 

CGGACACAAT GCATAOSOSC ' 900 

GGCTCTTTTC TGTGCATCCC 960 

GAGAGGrrGT AGACAAGTAC 1020 

TTGGATTGAT AGGCACATCA 1060 

CXrACTTTCAG ACAAAATGCT 1140 

TCTTACGAAT ACCTATAGAA 1200 

TTACCATTTT AAAGGGAAAT 1260 

ATGAAGGTGT TCTTTCTGTT 1320 

TGGAAATTGG AGTAAACAAT 1380 

TGAACAGAGC CTTGGTTACA 1440 

CTGCAGCCCA ATATGTGOGG 1500 

ATAAGGCATA TGACCCCGAA 1S60 

ATCCTAAAGG TTGGATCACC 1620 

TGGATAGGGA GGTTGAAACT 1680 

ACAAAGATGA TAGATCATGT 1740 

ATCCACCAGA AATACTTCAA 1800 

ACATTTTAGC TGTT6ATCCT 1860 

OCAATACTTC TCCAGAAATC 1920 

CCOGTCTTTC ATATCAGAAA 1980 

AAGACAGGGC CX3GCCAAGCT 2040 

ATCCAACTCA GTGTOGTGCG 2100 

TCCTTGC3UVT ATTACTGGGT 2160 

GAGTTTrTGG TGGAACTAAA 2220 

TTATATCAAA CACAGAAGCA 2280 

CCCAAACTAC CAACAACTCT 2340 

ATGGAGGGCA GGAAACCATT 2400 

GGGGGGCTGG 6CATCATCAT 2460 

ACTGCAGATA CACTTACTCX5 2520 

CCATTAGAGG ACACACTGGT 2580 

AATGAAGACC GCATGCCATC 2640 

TCTCCAGCTG GTTCTGTGGG 2700 

TTAAATAATT TGGAACGCAA 2760 

aVCAGTGCTA CAATTAGGTC 2820 

AAGTTCAATT TCAACATGTA 2880 

TCACCAATTT ATATTTTTAA 2940 

TAAAACAGAC AACTGGTAAA 3000 

TGCTCTTTTT TTTTTTTAGG 3060 

ACAATAGCTA AGTTATGCTA 3120 

AAAATAAACA AGAAATATTG 3180 

AAGACTGAAT TAAATTAAAA 3240 

ACTACCAAAT TCATTTGACT 3300 

TTTTCTATAG GAATATAGTT 3360 

AATATTTAAA ATGAAATGAG 3420 

6TATA6TTTG TGCTACAATA 3480 

TGCATTATAA CTGAGTCTAT 3540 

ATTGTAAATA AATTAAACTT 3600 

CAGTAGCTTT GCTTTGCAGT 3660 

6GGGAATACT CGCTGCAGCT 3720 

t ^lXir n'V l l t OGGGGAGCTA 3780 

ATCCTCTATT GCTGTTTCTA 3840 

AAATAACCAT GTCCTCCTAG 3900 

AAA6CACCCT 0GGGAGATT6 3960 

CAGGTCTGGG AGCTACAAAA 4020 

GGCCTGAATC AAG6AAAGCC 4080 

AACACCTCCA GCAGAGATTC 4140 

ATTAATTTTT AATCAGTTTG 4200 

ATTTTGAATG TATAAAAGAA 4260 

AGAAAGCAGC CCAA6TAGGT 4320 

CAACGGCAAG GAGAGGCCAC 4380 

ATACTTTTTC CTAGGCTTGG 4440

CACTGCCTTT TCCTTTCTCA GGCCAATGGC AACTGCCATT TGAGTCCGGT GAGGGATCAG 4500

CCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCTGAC 4560

TGCATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTGGTTGTGC AGAACAAACA 4620

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAAGAAGGT 4680

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTG TTTTCTAATT 4740

TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCC CCCTTTTTTT 4800

TTGAGACGGA GTCTCGCTCT GACGCAGAGG CTGGAGTGCA GTGGCTCCGA TCTCTGCTCA 4860

CTGAAAGCTC CGCCTCCCGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920

GACTACAGGC GCCCACCACC ACGCCCGGCT AATTTTTTGT ATTTTTAATA GAGACGGGGT 4980

TTCACTGTGT TAGCCAGGAT GGTCTCGATC TCCTGACCTC GTGATCCGCC TGCCTCGGCC 5040

TCCCAAAGTG CTGGGATTAC AGGCATGACC CACCGCTCCC GGCCTTGTTT TCCGTTTAAA 5100

GTCGTCTTCT TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT ACGAATTGGA 5160

TCAATCTTGA AATACTCAAC CAAAAGACAG TGGAGAAGCC AGGGGGAGAA AGAACTCAGG 5220

GCACAAAATA TTGGTCTGAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280

TGTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTGCCG ATAAACTTTC 5400

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT CTAACCATCT CTTTGTTCTT 5460

TGAACATGCT GAAAACCACC TGGTCTGCAT GTATGCCOGA ATTTGTAATT CTTTTCTCTC 5520

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT ATGTAAATAT ACAGAATGTT 5700

TTTTCTTACT TTTATAACGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760

TTTAAACAGA GTTTTAGTAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820

CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATGT GGGGAGATGT 5880

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAAGTAATA 5940

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000

TTTAGAGATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG GGAATATTTT GGACAATGTT 6120

TCATTATCAA ATTGTCGACA TCATTAATAT ATATTGTAAT GTTGGGAACA GATCACTATT 6180

TTGAAGCACA GCTTTACAGA TGAGTATCTA TGATACATAT GTATAATAAA TTTTGATOGG 6240

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACTGAC 6300

ACAGGGGTTT TACTTTGAGG ACCAGTGTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAG 6360

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTCACA GTGAATCTTT TCCCCATGCA 6480

GGAGTGTGCT CCCCTACAAA CGTTAAGACT GATCATTTCA AAAATCTATT AGCTATATCA 6540

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600

ACCATTATTT TTGTGTATGT CTTCAAGAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660

ATACCGGATA CATTTCACGT GTCCTTCAGT ATTGATTTGG TTGAATATTG GGTCATAATG 6720

GTTGAGAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT GAATCCTGGA TCTGTCACTT 6780

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAATT 6960

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTTGTG 7020
CATATATATA ATCCCGAAAC ATG

Protein Accession #: NP_077741.1

1          11         21         31         41         51
|          |          |          |          |          |

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 780
MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EESIRGHTG



Seq ID NO: 528 DNA sequence

Nucleic Acid Accession #: NM_001941.2

Coding sequence: 64..2754

1          11         21         31         41         51
|          |          |          |          |          |

GGCAGGTCTC GCTCTCGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGGGGAG CCGTCTGCCT GCATCTGCTG 120

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300

TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 660

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840

ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACGCGC 900

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960

AGCACAGGGG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTAGC TGTTGATCCT 1860

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460

ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAAAAT TGCATCGATG TAATCAGAAT 2580

GAAGACCGCA TGCCATCOCA AGATTATGTC CTCACTTATA ACTATGAGGG AAGAGGATCT 2640

CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA AAGCAGGAAG AAGATGGCCT TGACTTTTTA 2700

AATAATTTGG AACCCAAATT TATTACATTA GCAGAAGCAT GCACAAAGAG ATAATGTCAC 2760

AGTCCTACAA TTAGCTrCTTT GTCAGACATT CTGGAGGTrT CCAAAAATAA TATTGTAAAG 2820 

TTCAATTTCA ACATCTATGT ATATGATGAT TTTTTrCTCA ATTTTGAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAGC CAGTTGTTCC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AAGGTCTCTA AAGCATCTGC 30C0 

TCTTTTTTTT TTTTAC3QGAT ATTTTAGTAA TAAATATGCT GGATAAATAT TAGTCCAACA 3060 

ATAGCTAAGT TATGCTAATA TCACATTATr ATGTATTCAC TTTAAOTGAT AGTTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAGAAAGTT TTGGAAAAGA AACAATGAAG 3180 

ACTGAATTAA ATTAAAAATG TTGCAGCTCA TAAAGAATTG GGACTCACCC CTACTGCACT 3240 

AGCAAATTCA TTTGACTTTG GAGCCAAAAT GTGTTGAAGT GCCCTATGAA GTAGCAATTT 3300 

TCTATAGGAA TATAGTTCGA AATAAATGTG TGTGTGTATA TTATTATTAA TCAATGCAAT 3360 

ATTTAAAATG AAATGAGAAC AAAGAGGAAA ATGGTAAAAA CTTGAAATGA GGCTGGGGTA 3420 

TAGTTTGTCC TACAATAGAA AAAAGAGAGA GCTTCCTAGG CCTGGGCTCT TAAATGCTGC 3480 

ATTATAACTG AGTCTATGAG GAAATAGTTC CTGTCCAATT TGTGTAATTT GTTTAAAATT 3540 

GTAAATAAAT TAAACTTTTC TGGTTTCTGT GGGAAGGAAA TAGGGAATCC AATGGAACAG 3600 

TAGCTTTGCT TTGCAGTCTG TTTCAAGATT TCTGCATCCA CAAGTTAGTA GCAAACTGGG 3660 

GAATACTCGC TGCAGCTGGG GTTCCCTGCT TTTTGGTAGC AAGGGTCCAG AGATGAGGTG 3720 

TTTTTTTCGG GGAGCTAATA ACAAAAACAT TTTAAAACTT ACCTTTACTG AAGTTAAATC 3780 

CTCTATTGCT GTTTCTATTC TCTCTTATAG TGACCAACAT CTTTTrAATT TAGATCCAAA 3840 

TAACCATGTC CTCCTAGAGT TTAGAGGCTA GAGGGAGCTG AGGG GAGGA T CTTACTGAAA 3900 

GCACCCTGGG GAGATTGATT GTCCTTAAAC CTAAGCCCCA CAAACTTGAC ACCTGATCAG 3960 

GTCTGGGAGC TACAAAATTT CATTTTTCTC CTCACTGCCC TTCTTCT6AG TGGCATTGGC 4020 

CTGAATCAAG GAAAGCCAGG CCTTGTGGGC CCCCTTCTTT CGGCTTTCTG CTAAAGCAAC 4080 

ACCTCCAGCA GAGATTCCCT TAAGTGACTC CAGGTTTTOC ACCATCCTTC AGCGTGAATT 4140 

AATTTTTAAT CAGTTTGCTT TCTCCAGAGA AATTTTAAAA TAATAGAAGA AATAGAAATT 4200 

TTGAATGTAT AAAAGAAAAA GATCAAGTTG TCaTrTTAGA ACAGAGGGAA CTTTGGGAGA 4260 

AAGCAGCCCA AGTAGGTTAT TTGTACAGTC AGAGGGCAAC AGGAA6ATGC AGGCCTTCAA 4320 

GGGCAAGGAG AGGCCACAAG GAATATGGGT GGGAGTAAAA GCAACATCGT CTGCTTCATA 4380 

CTTTTTCCTA GGCTTGGCAC TGCCTTTTCC TTTCTCAGGC CAATGGCAAC TGCCATTTGA 4440 

GTCCGGTGAG GGATCAGCCA ACCTCTTCTC TATGGCTCAC CTTATTTGGA GTGAGAAATC 4500 

AAGGAGACAG AOCTGACPGC ATGATQAGTC TGAAGGCATT TGCAGGATGA GCCTGAACTG 4560 

GTTGTGCAGA ACAAACAAGG CATTCATGGG AATTGTTGTA TTCCTTCTGC AGCCCTCCTT 4620 

CTGGGCACTA AGAAGGTCTA TGAATTAAAT GCCTATCTAA AATTCTGATT TATTCCTACA 4680 

TTTTCTGTTT TCTAATTTGA CCCTAAAATC TATGTGTTTT AGACTTAGAC TTTTTATTGC 4740 

CCCCCCCCCC rm - mi ' iG AGACGGAGTC TCGCTCTGAC GCACAGGCTG GAGTGC»GTG 4800 

GCTCCGATCT CTCCTCACTG AAAGCTCCGC CTCCCGGGTT CATGCCATTC TCCTGC CTCA 4860 

GCCTCCTGAG TAGCTGGGAC TACAGGCGCC CACCACCACG CCCGGCTAAT TTTTTGTATT 4920 

TTTAATAGAG ACGGGGTTTC ACTGTGTTAG CCAGGATGGT CTCGATCTCC TGACCTOGTG 4980 

ATCCGCCTOC CTCGGCCTCC CAAAGIGCTG GGATTACAGG CATGACCCAC CGCTCCOGGC 5040 

CTTGTTTTCC GTTTAAAGTC GTCTTCTTTT AATGTAATCA TTTTGAACAT GTGTGAAAGT 5100 

TCATCATACG AATTGGATCA ATCTTGAAAT ACTCAACCAA AAGACAGTCG AGAAGCCAGG S160 

GGGAGAAAGA ACTCAGGGCA CAAAATATTG GTCTGAGAAT GGAATTCTCT GTAAGCCTAG 5220 

TTCCTGAAAT TTCCTGCTGT AACCAGAAGC CAGTTTTATC TAACGGCTAC TGAAACACCC 5280 

ACTGTGTTTT GCTCACICCC TCACTCA0CX5 ATCAAAACCT GCTACCTCCC CAAGACTTTA 5340 

CTAGTGCCGA TAAACTTTCT CAAAGAGCAA CCAGTATCAC TTCCCTGTTT ATAAAACCTC 5400 

TAACCATCTC ' m xy iTC TTT GAACATGCTG AAAACCACCT GGTCTGCATG TATG CCOGAA 5460 

T7TCTAATTC TTTTCTCTCA AATGAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATATGTAGT ATTATTATTT CCTTATATGT GTAAGGTGAA ATTTATGGTA TTTGAGTGTG 5580 

CAAGAAAATA TATTTTTAAA GCTTTCATTT TTCCCCCAGT GAATGATTTA GAATTTTTTA 5640 

TGTAAATATA CAGAATGTTT TTTCTTACTT TTATAAGGAA GCAGCTGTCT AAAATGCAGT 5700 

GGGGTTTGTT TTGCAATCTT TTAAACAGAG T T T TAGTATT GCTATTAAAA GAAGTTACTT 5760 

TGCTTTTAAA GAAACTTGGC TCCTTAAAAT AAGCAAAAAT TGGATGCATA AAGTAATATT S820 

TACAGATCTG GGGAGAT6TA ATAAAACAAT ATTAACTTGG TTTCTTGTTT TTGCTGTATT 5880 

TAGAGATTAA ATAATTCTAA GATGATCACT TTGCAAAATT ATGCTTATGG CTGGCATGGA 5940 

AATAGAAATA CTCAATTATG TCTrrGTTGT ATTAATGGGG AAXATTTTGG ACAATGTTTC 6000 

ATTATCAAAT TGTOGACATC ATTAATATAT ATTGTAATGT TGGGAAGAGA TCACTATTTT 6060 

GAAGCACAGC TTTACACATG AGTATCTATG ATACATATGT ATAATAAATT TTGATOGGGT 6120 

ATTAAAAGTA TTAGAAGGTG GTTATAATTG CAGAGTATTC CATGAATAGT ACACTGACAC 6180

AGGGGTTTTA CTTTGAGGAC CAGTGTAGTC AAGGGAAAAC ATGAGTTAAA AAGAAAAGCA 6240

GGCAATATTG CACTCTTGAT TCTGCCACTT ACAGGATAGA TAATGCCTGA ACTTTAATGA 6300

CAAGATGATC CAACCATAAA GGTGCTCTGT GCTTCACAGT GAATCTTTTC CCCATGCAGG 6360

AGTGTGCTCC CCTACAAACG TTAAGACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420

AGCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 6480

CATTATTTTT GTGTATGTCT TCAAGAATGT TCATTGGATT TTTGTTTGTA ATAGTAAAAT 6540

ACCGGATACA TTTCACGTGT CCTTCAGTAT TGATTTGGTT GAATATTGGG TCATAATGGT 6600

TGAGAAGCAT GGACACTAGA GCCAGAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 6660

TTCTGTGTGA CCTTTGAAAG GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720

GAACAATGCC ACCCTCATGG GGTTGTTGAA TGATTAAATT AGTTAATATA CCTAAAGTAC 6780

ATAGAACACT GCCTGCACAT AGTAAAAGAA TTATAAGTGT GAGGTAGTTG GTAAAATTAT 6840

GTAGTTGGAT ATACTACCGA ACAATATCTA ATCTCTTTTT AGGGAAATAA AGTTTGTGCA 6900
TATATATAAT CCCGAAACAT G

Seq ID NO: 529 Protein sequence
Protein Accession #: NP_001932.1






1          11         21         31         41         51
|          |          |          |          |          |

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 780

MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EKLHRCNQNE 840
DRMPSQDYVL TYNYEGRGSP AGSVGCCSEK QEEDGLDFLN NLEPKFITLA EACTKR

Seq ID NO: 530 DNA sequence

Nucleic Acid Accession #: NM_016583.2

Coding sequence: 72..842

1          11         21         31         41         51
|          |          |          |          |          |

GGAGTGGGGG AGAGAGAGGA GACCAGGACA GCTGCTGAGA CCTCTAAGAA GTCCAGATAC 60

TAAGAGCAAA GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 120

CCATGGCCCA GTTTGGAGGC CTGCCCGTGC CCCTGGACCA GACCCTGCCC TTGAATGTGA 180

ATCCAGCCCT GCCCTTGAGT CCCACAGGTC TTGCAGGAAG CTTGACAAAT GCCCTCAGCA 240

ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA TTCTGGAAAA CCTTCCGCTC CTGGACATCC 300

TGAAGCCTGG AGGAGGTACT TCTGGTGGCC TCCTTGGGGG ACTGCTTGGA AAAGTGACGT 360

CAGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCCC CAGCTGCTGG 420

AACTTGGCCT TGTGCAGAGC CCTGATGGCC ACCGTCTCTA TGTCACCATC CCTCTCGGCA 480

TAAAGCTCCA AGTGAATACG CCCCTGGTOG GTGCAAGTCT GTTGAGGCTG GCTGTGAAGC 540

TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAGAGGATC CACCTGGTCC 600

TTGGTGACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 660

CCCTCCCCAT TCAAGGTCTT CTGGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCCTG 720

AGTTGGTTCA GGGCAACGTG TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTGGACATCA 780

CCCTGGTGCA TGACATTGTT AACATGCTGA TCCACGGACT ACAGTTTGTC ATCAAGGTCT 840

AAGCCTTCCA GGAAGGGGCT GGCCTCTGCT GAGCTGCTTC CCAGTGCTCA CAGATGGCTG 900

GCCCATGTGC TGGAAGATGA CACAGTTGCC TTCTCTCCGA GGAACCTGGC CCCTCTCCTT 960

TCCCACCAGG CGTGTGTAAC ATCCCATGTG CCTCACCTAA TAAAATGGCT CTTCTTCTGC 1020
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA

Seq ID NO: 531 Protein sequence
Protein Accession #: NP_057667.1

1          11         21         31         41         51
|          |          |          |          |          |

MFQTGGLIVF YGLLAQTMAQ FGGLPVPLDQ TLPLNVNPAL PLSPTGLAGS LTNALSNGLL 60
SGGLLGILEN LPLLDILKPG GGTSGGLLGG LLGKVTSVIP GLNNIIDIKV TDPQLLELGL 120
VQSPDGHRLY VTIPLGIKLQ VNTPLVGASL LRLAVKLDIT AEILAVRDKQ ERIHLVLGDC 180
THSPGSLQIS LLDGLGPLPI QGLLDSLTGI LNKVLPELVQ GNVCPLVNEV LRGLDITLVH 240
DIVNMLIHGL QFVIKV

Seq ID NO: 532 DNA sequence
Nucleic Acid Accession #: NM_004363.1
Coding sequence: 115..2223

1          11         21         31         41         51
|          |          |          |          |          |

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAACGT 60

TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC AGACAGCAGA GACCATGGAG 120

TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCTGGC AGAGGCTCCT GCTCACAGCC 180

TCACTTCTAA CCTTCTGGAA CCCGCCCACC ACTGCCAAGC TCACTATTGA ATCCACGCCG 240

TTCAATGTCG CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC GCAGCATCTT 300

TTTGGCTACA GCTGGTACAA AGGTGAAAGA GTGGATGGCA ACCGTCAAAT TATAGGATAT 360

GTAATAGGAA CTCAACAAGC TACCCCAGGG CCGGCATACA GTGGTCGAGA GATAATATAC 420

CCCAATGCAT CCCTGCTGAT CCAGAACATC ATCCAGAATG ACACAGGATT CTACACCCTA 480

CACGTCATAA AGTCAGATCT TGTGAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCCG 540

GAGCTGCCCA AGCCCTCCAT CTCCAGCAAC AACTCCAAAC CCGTGGAGGA CAAGGATGCT 600

GTGGCCTTCA CCTGTGAACC TGAGACTCAG GACGCAACCT ACCTGTGGTG GGTAAACAAT 660

CAGAGCCTCC CGGTCAGTCC CAGGCTGCAG CTGTCCAATG GCAACAGGAC CCTCACTCTA 720

TTCAATGTCA CAAGAAATGA CACAGCAAGC TACAAATGTG AAACCCAGAA CCCAGTGAGT 780

GCCAGGCGCA GTGATTCAGT CATCCTGAAT GTCCTCTATG GCCCGGATGC OCCCACCATT 840

TCCCCTCTAA ACACATCTTA CAGATCAGGG GAAAATCTGA ACCTCTCCTG CCACGCAGCC 900

TCTAACCCAC CTGCACAGTA CTCTTGGTTT GTCAATGGGA CTTTCCAGCA ATCCACCCAA 960

GAGCTCTTTA TCCCCAACAT CACTGTGAAT AATAGTGGAT CCTATACGTG CCAAGCCCAT 1020

AACTCAGACA CTGGCCTCAA TAGGACCACA GTCACGACGA TCACAGTCTA TGCAGAGCCA 1080

CCCAAACCCT TCATCACCAG CAACAACTCC AACCCGGTGG AGGATGAGGA TGCTGTAGCC 1140

TTAACCTGTG AACCTGAGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200

CTCCCGGTCA GTCCCAGGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAGT 1260

GTCACAAGGA ATGATGTAGG ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320

CACAGCGACC CAGTCATCCT GAATGTOCTC TATGGCCCAG ACGACCCCAC CATTTCCCCC 1380

TCATACACCT ATTACCGTCC AGGGGTGAAC CTCAGCCTCT CCTGCCATGC AGCCTCTAAC 1440

CCACCTGCAC AGTATTCTTG GCTCATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCCAGGC CAATAACTCA 1560

GCCAGTGGCC ACAGCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620

CCCTCCATCT CCAGCAACAA CTCCAAACCC GTGGAGGACA AGGATGCTGT GGCCTTCACC 1680

TGTGAACCTG AGGCTCAGAA CACAACCTAC CTGTGGTGGG TAAATGGTCA GAGCCTCCCA 1740

GTCAGTCCCA GGCTGCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800

AGAAATGACG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860

GACCCAGTCA COCTGGATGT CCTCTATGGG CCGGACACCC CCATCATTTC CCCCCCAGAC 1920

TCGTCTTACC TTTCGGGAGC GAACCTCAAC CTCTCCTGCC ACTCGGCCTC TAACCCATCC 1980

CCGCAGTATT CTTGGCGTAT CAATGGGATA CCGCAGCAAC ACACACAAGT TCTCTTTATC 2040

GCCAAAATCA CGCCAAATAA TAACGGGACC TATGCCTGTT TTGTCTCTAA CTTGGCTACT 2100

GGCCGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCTCCTGGT 2160

CTCTCAGCTG GGGCCACTGT OGGCATCATG ATTGGAGTOC TGGTTGGGGT TGCTCTGATA 2220

TAGCAGCCCr GGTGTAGTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280 

TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAGAAA 2340 

AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACATCG TGAAACCCCA TCTCTACTAA 2400 

AAATACAAAA ATGAGCTGGG CTTGGTGGCG CGCACCTGTA GTCCCAGTTA CTCGGGAGGC 2460 

TGAGGCAGGA GAATCX3CTTG AACCCGGGAG GTGGAGATTG CAGTGAGCCC AGATCGCACC 2 520 

ACTGCACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580 

TCTGACCTGT ACTCTTGAAT ACAAGTTTCT GATACCACTG CACTGTCTGA GAATTTCCAA 2640 

AACTTTAATC AACTAACTGA CAGCTTCATG AAACTGTCCA CCAAGATCAA GCAGAGAAAA 2700 

TAATTAATTT CATGGGACTA AATGAACTAA TGAGGATTGC TGATTCTTTA AATGTCTTGT 2760 

TTCCCAGATT TCACGAAACT Trrm 'C TT T TAAGCTATCX: ACTCTTACAG CAATTTGATA 2820 

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATG TGGTCGCTCC 2 880 

AGACTTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAGT 2940 
TCAATAAAAA TCTGCTCTTT GTATAACAGA AAAA 



Seq ID NO: 533 Protein sequence
Protein Accession #: NP_004354.1

1          11         21         31         41         51
|          |          |          |          |          |

MESPSAPPHR WCIPWQRLLL TASLLTFWNP PTTAKLTIES TPFNVAEGKE VLLLVHNLPQ 60

HLFGYSWYKG ERVDGNRQII GYVIGTQQAT PGPAYSGREI IYPNASLLIQ NIIQNDTGFY 120

TLHVIKSDLV NEEATGQFRV YPELPKPSIS SNNSKPVEDK DAVAFTCEPE TQDATYLWWV 180

NNQSLPVSPR LQLSNGNRTL TLFNVTRNDT ASYKCETQNP VSARRSDSVI LNVLYGPDAP 240

TISPLNTSYR SGENLNLSCH AASNPPAQYS WFVNGTFQQS TQELFIPNIT VNNSGSYTCQ 300

AHNSDTGLNR TTVTTITVYA EPPKPFITSN NSNPVEDEDA VALTCEPEIQ NTTYLWWVNN 360

QSLPVSPRLQ LSNDNRTLTL LSVTRNDVGP YECGIQNELS VDHSDPVILN VLYGPDDPTI 420

SPSYTYYRPG VNLSLSCHAA SNPPAQYSWL IDGNIQQHTQ ELFISNITEK NSGLYTCQAN 480

NSASGHSRTT VKTITVSAEL PKPSISSNNS KPVEDKDAVA FTCEPEAQNT TYLWWVNGQS 540

LPVSPRLQLS NGNRTLTLFN VTRNDARAYV CGIQNSVSAN RSDPVTLDVL YGPDTPIISP 600

PDSSYLSGAN LNLSCHSASN PSPQYSWRIN GIPQQHTQVL FIAKITPNNN GTYACFVSNL 660
ATGRNNSIVK SITVSASGTS PGLSAGATVG IMIGVLVGVA LI

Seq ID NO: 534 DNA sequence

Nucleic Acid Accession #: NM_006952.1

Coding sequence: 11..793

1          11         21         31         41         51
|          |          |          |          |          |

AATCCOGACA ATGGCGAAAG ACAACTCAAC TGTTCGTTGC TTCCAGGGCC TGCTGATTTT 60

TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTGACT GCGGAGTGCA TCTTCTTTGT 120

ATCTGACCAA CACAGCCTCT ACCCACTGCT TGAAGCCACC GACAACGACG ACATCTATGG 180

GGCTGCCTGG ATCGGCATAT TTGTGGGCAT CTGCCTCTTC TGCCTGTCTG TTCTAGGCAT 240

TGTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCTGGCG TATTTCATTC TGATGTTTAT 300

AGTATATGCC TTTGAAGTGG CATCTTGTAT CACAGCAGCA ACACAACGAG ACTTTTTCAC 360

ACCCAACCTC TTCCTGAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420

TGATGACCAG TGGAAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA TGCTCCAGGA 480

CAATTGCTGT GGCGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 540

TGAGAATAAT GATGCTGACT ATCCCTGGCC TCGTCAATGC TGTGTTATGA ACAATCTTAA 600

AGAACCTCTC AACCTGGAGG CTTGTAAACT AGGCGTGCCT GGTTTTTATC ACAATCAGGG 660

CTGCTATGAA CTGATCTCTG GTCCAATGAA CCGACACGCC TGGGGGGTTG CCTGGTTTGG 720

ATTTGCCATT CTCTGCTGGA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTGGAGCAG 780
AATTGAATAT TAAGAA

Seq ID NO: 535 Protein sequence
Protein Accession #: NP_008883.1

1          11         21         31         41         51
|          |          |          |          |          |

MAKDNSTVRC FQGLLIFGNV IIGCCGIALT AECIFFVSDQ HSLYPLLEAT DNDDIYGAAW 60

IGIFVGICLF CLSVLGIVGI MKSSRKILLA YFILMFIVYA FEVASCITAA TQRDFFTPNL 120

FLKQMLERYQ NNSPPNNDDQ WKNNGVTKTW DRLMLQDNCC GVNGPSDWQK YTSAFRTENN 180

DADYPWPRQC CVMNNLKEPL NLEACKLGVP GFYHNQGCYE LISGPMNRHA WGVAWFGFAI 240
LCWTFWVLLG TMFYWSRIEY

Seq ID NO: 536 DNA sequence

Nucleic Acid Accession #: NM_002638.1

Coding sequence: 120..473

1          11         21         31         41         51
|          |          |          |          |          |

CAATACAGCT AAGGAATTAT CCCTTCTAAA TAOCACAGAC COGCCCTGGA GCCAGGCCAA 60

GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CTTGACACCA 120

TGAGGGCCAG CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT CGCTGGGACG CTGGTTCTAG 180

AGGCAGCTGT CACGGGAGTT CCTGTTAAAG GTCAAGACAC TGTCAAAGGC CGTGTTCCAT 240

TCAATGGACA AGATCCCGTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 300

CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 360

TCCGGTGCGC CATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAGGAA 420

TCAAGAAGTG CTGTGAAGGC TCTTGCGGGA TGGCCTGTTT CGTTCCCCAG TGAAGGGAGC 480

CGGTCCTTGC IGCACCTGTG COGTCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 540 

TCCTGCCCTT CCOCTTCCCA CaCTGTCCAT TCTTCCTCCX: ATTCAGGATC CCCAGQGCTG 600 
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A 

Seq ID NO: 537 Protein sequence
Protein Accession #: NP_002629.1

1          11         21         31         41         51

MRASSFLIVV VFLIAGTLVL EAAVTGVPVK GQDTVKGRVP FNGQDPVKGQ VSVKGQDKVK 60
AQEPVKGPVS TKPGSCPIIL IRCAMLNPPN RCLKDTDCPG IKKCCEGSCG MACFVPQ

Seq ID NO: 538 DNA sequence

Nucleic Acid Accession #: NM_001793.2

Coding sequence: 71..2560

1          11         21         31         41         51
|          |          |          |          |          |

AAAGGGGCAA GAGCTGAGCG GAACACCGGC CCGCCGTCGC GGCAGCTGCT TCACCCCTCT 60 

CTCTGCAGCC ATGGGGCTCC CTCGTGGACC TCTCGCGTCT CTCCTCCTTC TCCAGGTTTG 120 

CTGGCTGCAG TGCGCGGCCT CCGAGCOGTG CCGGGCGGTC TTCAGGGAGG CTGAAGTGAC 180 

CTTGGAGGC6 GGAGGCGC6G AGCAG6AGCC 06GCCAGGCG CTGGGGAAAO TATTCATGGG 240 

CTGCCCTGGG CAAGAGCCAG CTCTGTTTAG CACTGATAAT GATGACTTCA CTGTGCGGAA 300 

TGGC6AGACA GTCCAGGAAA GAAGGTCACT GAAGGAAAGG AATCCATTGA AGA TCTT CCC 360 

ATCCAAACGT ATCTTACGAA GACACAAGAG AGATTGGGTG GTTGCTCCAA TATCTGTCOC 420 

TGAAAATGGC AAGGGTCCCT TCCCCCAGAG ACTGAATCAG CTCAAGTCTA ATAAAGATAG 480 

AGACACCAAG ATTTTCTACA GCATCAOGGG GCCGGGGGCA GACAGCCCCC CTGAGGGTGT 540 

CTTCGCTGTA GAGAAGGAGA CRGGCTGGTT GTTGTTGAAT AAGCCACTGG ACCGGGAGGA 600 

GATTGCCAAG TATGAGCTCT TTGGCCACGC TGTGTCAGAG AATGGTGCCT CAGTGGAGGA 660 

CCCCATGAAC ATCTCCATCA TCGTGAOCGA CCAGAATGAC CACAAGCCCA AGTTTACCCA 720 

GGACACCTTC CGAGGGAGTG TCTTAGAGGG AGTCCTACCA GGTACTTCTG TGATGCAGGT 780 

GACAGCCACG GATGAGGATG ATGCCATCTA CACCTACAAT GGGGTGGTTG CTTACTCCAT 840 

CCATAGCCAA GAACCAAAGG ACCCACACGA CCTCATGTTC ACCATTCACC GGAGCACAGG 900 

CACCATCAGC GTCATCTCCA GTGGCCTGGA CCGGGAAAAA GTCCCTGAGT ACACACTGAC 960 

CATCCRGGCC ACAGACATGG ATGGGGAOGG CTCCACCACC ACGGCAGTGG CAGTAGTGGA 1020 

GATCCTT6AT GCCAATGACA ATGCTCCCAT GTTTGACCCC CAGAAGTACG AGGCCCATGT 1080 

GCCTGAGAAT GCAGTGGGCC ATGAGGTGCA GAGGCTOACG GTCACTGATC TGGACGCCCC 1140 

CAACTCACCA GCGTGGCGTG CCACCTACCT TATCATGGGC GGTGACGACG GGG ACCA TTT 1200 

TACCATCACC ACCCACCCTG AGAGCAACCA GGGCATCCTG ACAACCAGGA AGGGTTTGGA 1260 

TTTEGAGGCC AAAAACCAGC ACACCCTGTA CGTTGAAGTG ACCAAOGAGG CCCCTTTTGT 1320 

GCTGAAGCTC CCAACCTCCA CAGCCACCAT AGTGGTCCAC GTGGAGGATG TGAATGAGGC 1380 

ACCTGTGTTT GTCCCACCCT CCAAAGTCGT TGAGGTCCAG GAGGGCATCC CCACTGGGGA 1440 

GCCTGTGTGT GTCTACACTG CAQAAGACCC TGACAAGGAG AATCAAAAGA TCACCTACCG 1500 

CATCCTGAGA GACCCAGCAG GGTGGCTAGC CATGGACCCA GACAGTGG6C AGGTCACAGC 1560 

TCTGGGCACC CTCGACCGTG AGGATGAGCA GTTTGTGAGG AACAACATCT ATGAAGTCAT 1620 

GGTCTTGGCC ATGGACAATG GAAGCCXTTCC CACCACTGGC ACGGGAACCC TTCTGCTAAC 1680 

ACTGATTGAT GTCAATGACC ATGGCCCAGT CCCTGAGCCC C6TCAGATCA CCATCTGCAA 1740 

CCAAAGCCCT GTGCGCCAGO TGCTQAACAT CACGGACAAG GACCTGTCTC CCCACACCTC 1800 

CCCTTTCCAG GCCCAGCTCA CAGATGACTC AGACATCTAC TGGAOGGCAG AGGTCAACGA 1860 

GGAAGGTGAC ACAGTGGTCT TGTCCCTGAA GAAGTTCCTG AAGCAGGATA CATATGACGT 1920 

GCACCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCTG ACGGTGATCA GGGCCACTGT 1980 

GTGCGACTGC CATGGCCATG TCGAAACCTG CCCTGGACCC TGGAAGGGAG GTTTCATCCT 2040 

CCCTGTGCTG GGGGCTGTCC TGGCTCTGCT GTTCCTCCTG CTGGTGCTGC TTTTGTTGGT 2100 

GAGAAAGAAG CGGAAGATCA AGGAGCCCCT CCTACTCTCA GAAGATGACA CCCGTGACAA 2160 

CGTCTTCTAC TATGGCGAAG AGGGGGGTGG CGAAGAGGAC CAGGACTATG ACATCACCCA 2220 

GCrCCACOGA GGTCTGGAGG 0CAGG00G6A GGT6GTTCTC 0GCAATGA08 TGGCACCAAC 2280 

CATCATCCOG ACACCCATCT ACCGTCCTCG GCCAGCCAAC CCAGATGAAA TCGGCAACTT 2340 

TATAATTGAG AACCTGAAGG CGGCTAACAC AGACCCCACA GCCCCGCCCT ACGACACCCT 2400 

CTTGGTGTTC GACTATGAGG GCAGCGGCTC CX3ACGCCGCG TCCCTGAGCT CCCTCACCTC 24 60 

CTCCGCCTCC GACCAAGACC AAGATTACGA TTATCTGAAC GftGTGGGGCA GCCGCTTCAA 2520 

GAAGCTGGCA GAGATGTACG GTGG066GGA G6ACGACTA6 GOGGCCTGCC TGCAGGGCTG 2580 

GGGACCAAftC GTCAGGCCAC AGAGCATCTC CAAGGGGTCT CAGTTCCCCC TTCAGCTGAG 2640 

GACTTCGGAG CTTGTCAGGA AGTGGCOGTA GCAACTTGGC GGAGACAGGC TATGAGTCTG 2700 

ACGTTAGAGT GGTTGCTTCC TTAGCCTTTC AGGATGGAGG AATGTGGGCA GTTTGACTTC 2760 

AGCACTGAAA ACCTCTCCAC CTGGGCCAGG GTTGCCTCAG AGGCCAAGTT TCCAGAAGOC 2820 

TCTTACCTGC CGTAAAATGC TCAACCCTGT GTCCTGGGCC TGG6CCTGCT GT6ACTGA0C 2880 

TACAGTGGAC TTTCTCTCTG GAATGGAACC TTCTTAGGCC TCCTGGTGCA ACTTAATTTT 2940

TTTTTTTAAT GCTATCTTCA AAAOGTTAGA GAAAGTTCTT C3UUVAGTGCA GCCX3\GAGCT 3000

GCTGGGCCCA CTGGCOGTCC TGCATTTCTG GTTTCCAGAC COO^ArGOCT OCX31TTOGGA 3060

TGGATCTCTG OGTTTTTATA CTGAGTGTGC CTAGGTTGCC CCTTATTTTT TATTTTOOCT 3120

CTTGOGTTGC TATAGATGAA GGGTGAGGAC AATOGTGTAT ATGTACTAGA ALl'inTlAT 3180
TAAAGAAACT TTTOCCAGAA AAAAA


Seq ID NO: 539 Protein sequence
Protein Accession #: NP_001784.2

1          11         21         31         41         51
|          |          |          |          |          |

KGLPRGPLAS LLLUJVCWLQ CAASEPC31AV FHEAEVTIiEA GGAGQEPGQA LGXVPKGCPG 60 

QEPALFSTW DDFTVRNGET VQERRSLKER KPLKIFPSKR ILRRHKRDWV VAPISVPESG 120 

RGPFPQSLMQ LKSNKDRDTK IPYSITGK5A DSPPEGVFAV EKETGW*LLN KPLDREEIAK 180 

YELFGKAVSB NGASVEDPKN ISIIVTCQND HKPKFTQDTF RGSVLfiGVLP GTSVMQVTAT 240 

DEDDAIYTYN GWAYSIHSQ EPKDPHDIWP TIHRSTOTIS VZSSGLOREK VPEYTLTIOA 300 

TDMDGDGSTT TAVAWEILD ANDNAPMFDP QKYEAHVPEN AVCSBVQRLT VTDLDAPNSP 360 

AKRATYIilMG GDDGDHFTIT THPESNQGIL TTRKGLDPEA KNQHTLYVEV TNEAPFVLKL 420 

PTSTATIWH VEDVNEAFVF VPPSKWEVQ EGIPTGBPVC VYTAEDroKE NQKISYRILR 480 

OPAGWLAMDP DSGQVTAVGT LDREDEQFVR KNIYEVMVItA MDMGSPFTTG TGTLLLTLID 540 

VNDHGPVPBP RQITICNQSP VRQVUIITDK DLSPHTSPFQ AQLTDDSDIY HTACVMEEGD 600 

TWLSLKKFL KQDTYCVHLS LSDUGNKEQL TVIRATVOX: HGEVETCPGP HKQGPZLPVL 660 

GAVLALLFLI* LVLLLLVRKK RKIKBPIOiLP EDDTRDMVFY YGEBGGGEED QDYDITQZiHR 720 

GLEARPEWL RNDVAPTIIP TPMYRPRPAN PDEIQIPIIB NLKAANTDPT APPYDTLLVF 780 
OYEGSGSDAA ShSShTSSAS DQDQDYDYI^ EWGSRFKKLA DMYGGGEDD 



Seq ID NO: 540 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1..672 



1 11 21 31 41 51 

| | | | | | 

ATGAGGCTCC AAAGACCCCG ACAGGCCCCG GCGGGTGGGA GGOGOGGGCC CCGGGGCGGG 60 

06GGGCTGCC CCTACOGGCC AGAGCGGGGG AGAGGOGGGC GGAGGCIG06 AAGGTTCCAC 120 

AAGG60GGG6 A0GGOGO6CC GOSOSCTGAC CCTCCCT6G6 CAOOGOnGGG GACGATOGOG 180 

CTGCTCGCCT TGCTGCTGGT OSTGGCCCTA COSCGGGTCT GGACAGAOGC CAACCTGACT 240 

GOGAGACAAC GAGATCCAGA GGACTCCCAG CGAACGGAGQ AGGGT6ACAA TAGAGTGTGG 300 

TGTCATGTTT GTGAGAGAGA AAACACTTTC GAGTGCCAGA ACCCAAGGAG GTGCAAATGG 360 

ACAGAGCCAT ACTGOGTTAT AGCGGCCGTG AAAATATTTC CACGTTTTTT CATGGTTGCG 420 

AAGCAGTGCr COGCTGGTTG TGCAGCGATG GAGAGACCCA AGCCAGAGGA GAAGCGGTTT 480 

CTCCTGGAAG AGCCCATGCC CTTCTTTTAC CTCAAGTGTT GTAAAATTCG CTACTGCAAT 540 

TTAGAGG6GC CACCTATCAA CTCATCAGTS TTCAAAGAAT AIXSCTGOGAG CAT6GGTGAG 600 

AGCTGTGGTG GGCTGTGGCT GGCCATCCTC CTGCTGCTGG CCTCCRTTGC AGCCGGCCTC 660 
AGCCTGTCTT GA 



Seq ID NO: 541 Protein sequence 
Protein Accession #: Eos sequence 



1 11 21 31 41 51 

| | | | | | 

MRLQRPRQAP AGGRRAPRGG RGSPYRPDPG RGARRUIRFQ KGGEGAPRAD PFWAPLGTMA 60 
LLAUiLWAL PRVWTDANIiT ARQRDPEDSQ RTDEGDNRVW CHVCERENTP EOQNPRRCKH 120 
TEPYCVIAAV KIPPRFFMVA KQCSAGCAAM BRPKPEEKRF LLKEPMPFPY LKCCKXRYOI 180 
LE6PPINSSV FKEYAGSMGC SCGGLtCLAIL ULASZAAGL 5LS 



Seq ID NO: 542 DNA sequence 

Nucleic Acid Accession #: XM_035292.2 

Coding sequence: 53..1576 



1 11 21 31 41 51 

| | | | | | 

GCTOGCTGGG CCGCGGCTCC OGGGTGTCCC AGGCCGGGCC GGTGCGCAGA GCATGGCGGG 60 

TGGGGGCCGG AAGCGGCGCG OGCTAGCGGC GCOGGOGGCC GAGGAGAAGG AAGAGGCG06 120 

GGAGAAGATG CTGGCCGCCA AGAGCGCGGA CGGCTOGGCG CCGGCAGGOG AGGGCGAGGG 180 

CGTGACCCTG CAGCGGAACA TCACGCTGCT CAAGGGOGTG GCCATCATOG TGGGGACCAT 240 

TATCX3GCTCG GGCATCTTCG TGACGCCCAC GGGCGTGCTC AAG6AGGCAG GCTCGCOGGG 300 

GCTGGOGCTG GTGGTGTGGG COGCGTGCGG C6TCTTCTCC ATCGTGGGOG CGCTCTGCTA 360 

CGGGGAGCTC GGCACCACCA TCTCCAAATC GGGCQGCGAC TAOSCCTACA 7GCT6GA0GT 420 

CTACGGCT06 CTGCCCGCCT TCCTCAAGCT CTGGATCGAG CTGCTCATCA TCCGGCCTTC 480 

ATCGCAGTAC ATOGTGGCCC TGGTCTTCGC CACCTACCTG CTCAAGCCGC TCTTCCCCAC 540 

CT6CCCGGTG CCCGAGGAGG CAGCCAAGCT CGTGGCCTGC CTCTGCGTGC TGCTGCTCAC 600 

GGCCGTGAAC TGCTACAGCG TGAAGGCC6C CACCCGGGTC CAGGATGCCT TTGCCGCCGC 660 

CAAGCTCCTG G0CCTG6CCC TGATCATCCT GCTGGGCTTC GTCCAGATCG GAAAGGGTGA 720 

TGTGTCCAAT CTAGATCCCA ACTTCTCATT TGAAGGCACC AAACTGGATG TG6GGAACAT 780 

TGTGCTGGCA TTATACAGCG GCCTCTTTGC CTATGGAGGA TGGAATTACT TGAATTTOGT 840 

CACAGAGGAA ATGATCAACC CCTACAGAAA CCTGCCCCTG GCCATCATCA TCTCCCTGCC 900 

CATCXyPGAOS CTGGTGTACG TGCTGACCAA CCXX3GCCTAC TTC^CCACCC TGTCCACCGA 960 

GCAGATGCTG TCX?rCCGAGG CCGTGGCCGT GGACTTOGGG AACTATCACC TGGGCGTCAT 1020 

GTCCTGGATC ATCCCCGTCT TCGTGGGCCT GTCCTGCTTC GGCTCCX3TCA ATGGGTCCCT 1080 

GTTCACATCC TCCAGGCTCT TCTTOGTGGG GTCCOGGC3AA GGOCACCTGC CCTCCATCCT 1140 

CTCCATGATC CACCCACAGC TCCTCACCCC G6TGCCGTCC CTCGTGTTCA CGTGTGTGAT 1200 

GAOGCTGCTC TACGCCTTCT CCAAGGACAT CTTCTCCGTC ATCAACTTCT TCAGCTTCTT 1260 

CAACTGGCTC TGOGTGGCCC TGGCCATCAT CGGCATGATC TGGCTGCGCC ACAGAAAGCC 1320 

TGAGCTTGAG CGGOCCATCA AGGTGAACCT GGCCCTGCCT GTGTTCTTCA TCCTGGCCTG 1380 

CCTCXTCCTG ATCCCCGTCT CCTTCTGGAA GACACCCGTG GAGTGTGGCA TCGGCTTCAC 1440 

CATCATGCTC AGOGGGCTGC CCGTCTACTT CTTCGGGGTC TGGTG6AAAA ACAAGCCCAA 1500 

GT6GCTCCTC CAGG6CATCT TCTCCACGAC GBTGCTGTGT CAGAAGCTCA TGCAGGTGGT 1560 

CCCXXSUGGAG ACATAGOCAG GAGGCOGAGT OGCTGCGGGJV GGAGCftTGC 



Seq ID NO: 543 Protein sequence 
Protein Accession #: XP_035292.2 

1 11 21 31 41 51 

| | | | | | 

MAGAGPXRRA LAAPAAEEKE EAREKMLAAK SADGSAPAGE GEGVTLQRNI TLLNGVAIIV 60 

GTIIGSGIFV TPTGVLKEAG SPGLALVV«A AOGVFSIVGA LCYAELGTTI SKSGGDYAYM 120 

LEVYGSLPAP LKUfXBLLII HPSSQYIVAL VFATYLLRPL FPTCPVPBBA AKLVACLCVL 180 

U.TAVNCYSV KAATRVQDAF AAAKI*LALAZ* IILL6FVQIG XGDVSKLDPN FSFBGTKZJ)V 240 

GIIVLALYSG LPAYGGWUYL NFVTEEMINP YRHLPLAIII SLPIVTLVYV LTNLAYFTTI. 300 

STEQMLSSEA VAVDFGNYHL GVMSWIIPVF VGLSCFGSVSJ GSLFTSSRLF FVQSREGHLP 360 

SILSMIHPQI- LTPVPSLVPT CVMTLLYAFS KDIFSVINPP SFF2JWLCVAL AIIGMIMLJiH 420 

RKPELERPIK VNLALPVFFI lACLFLIAVS PWKTPVEOGI GFTIILSGLP VYPFGVWWKN 480 
XPKHIiLQGIP STTVLOQKLM QWPQET 

Seq ID NO: 544 DNA sequence 
Nucleic Acid Accession #: NM_005268.1 
Coding sequence: 168..989 

1 11 21 31 41 51 

| | | | | | 

TAAAAAGCAA AAGAATTCGC GGCCGCGT06 ACACGG6CTT CXXX3GAAAAC CTTCCCOGCT 60 

TCTQGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC CG6CTGCTGG GAGCCAGGAG 120 

AGCCCTGAGG AGTAGTCACT CAGTAGCAGC TGACGCGTGG GTCCACCATG AACTGGAGTA 180 

TCTTTGAGGG ACTCCTGAGT GGGGTCAACA AGTACTCCAC AGCCTTTGGG CGCATCTGGC 240 

TGTCTCTGGT CTTCATCTTC CGCGTGCTGG TGTACCTGGT GACGGOCGAG CGTCTGTGGA 300 

6TGATGACCA CAAGGACTTC GACTGCAATA CTOGCCAGCC CGGCTGCTCC AAOCTCTGCT 360 

TtGATGAGTT CiTCCCrGTG TCCCATGTOC GCCTCTGGGC CCTGCAGCTT ATCCTGGTGA 420 

CATGCCCCTC ACTGCTOGTG GTCATGCACG TGGCCTACCG GGA6GTTCAG GAGAAGAGGC 480 

A0CX3AGAAGC CCATGGGGAG AACAGTGGGC GCCTCTACCT 6AACC0CGGC AAGAAGCGGG 540 

GTGGGCTCTG GTG6ACATAT GTCTGCAGCC TAGT6TTCAA GGCGAGCGTG GACATCGCCT 600 

TTCTCTATGT GTTCCACTCA TTCTACCCCA AATATATCCT CCCTCCTGTG GTCAAGTGCC 660 

AOGCAGATCC ATGTCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAGAAGAACA 720 

TTTTCACCCT CTTCATGGTG GCCACAGCTG CCATCTGCAT CCTGCTCAAC CTOGTGGAGC 780 

TCATCTACCT GGTGAGCAAG AGATGCCAGG AGTGCCTGGC AGCAAGGAAA GCTCAAGGCA 840 

TCTGCACAGG TCATCACCCC CAOGGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT 900 

CGGGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCGCCCCX 960 

GAGACCATGT GAAGAAAACC ATCTTGTGAG GGGCTGCCTG GACTGGTCTG GCAGGTTGGG 1020 

CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGCAACC TGAGAGTGGG GGAGCTAAGC 1080 

CATGAGGTAG GGGCAGCCAA GAGAGAGGAT TCAGAG6CTC TGGGAGGCAG TTCXTTAGTCC 1140 

TCAACTCCAG CCAOCTGCCC CAGCTGGACG GCACTGGGCC AGTTOCCOCT CTGCTCTGCA 1200 
GCTOGGTTTC CTTTTCTAGA ATGGAAATAG TGAGGGCCAA TGC 



Seq ID NO: 545 Protein sequence 
Protein Accession #: NP_005259.1 

1 11 21 31 41 51 

| | | | | | 

MNWSIFEGLL SGVNKYSTAP GRIHLSLVFT FRVLVYLVTA ERVWSDDHKD FOCNTRQPGC 60 
SNVCFDEFFP VSHVRLWALQ LILVTCPSLL WMHVAYREV QEKRHREAHG ENSGRLYLNP 120 
GKKRGGLWWT YVCSLVFKAS VDIAFIiYVFH SFYPKYILPP WKCHADPCP NIVDCFISKP 180 
SEKNIFTLFM VATAAICILL NLVELIYLVS KRCHECLAAR KAQAMCTGHH PHGTTSSCKQ 240 
DDLLS6DLIP LGSDSHPPLl. PDRPSOHVKK TIL 



Seq ID NO: 546 DNA sequence 

Nucleic Acid Accession #: NM_002391.1 

Coding sequence: 26..457 



1 11 21 31 41 51 

| | | | | | 

OGGGCGAAGC A6GGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60 

OGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGOGG 120 

CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180 

CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 

GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 

TGOGTGTGAT GGGGGCACAG GCAQCAAAGT 0C6CCAAGGC ACCCTGAAGA AGGCGCGCTA 360 

CAATGCTCA6 TGCCA60AGA CCATGCGCGT CACCAAGCCC T6CACCCCCA AGACCAAAGC 420 

AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGAOG CCAAGCCTGG ATGCCAAGGA 480 

GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCOG AGATGTGACC 540 

CACCACTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 

ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 

TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 720 

ATTACTAA6A AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 



Seq ID NO: 547 Protein sequence 
Protein Accession #: NP_002382.1 

1 11 21 31 41 51 

| | | | | | 

MQHRGFLLLT LLALLAIiTSA VAKKKDKVKK OGPGSECAEW AHGPCTPSSX DOGVGFREGT 60 
CGAQTQRIRC RVPCNWX2CEP GADOCYKFEH WGACDGGTGT KVRQGTLKXA RYMAQOQETZ 120 
RVTKPCTPiCT KAKAKAKKGK GKD 

Seq ID NO: 548 DNA sequence 
Nucleic Acid Accession #: NM_006783.1 
Coding sequence: 1..786 

1 11 21 31 41 51 
| | | | | | 
ATGGATTGGG GGACGCTGCA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGCATC 60 
GGGAAGGTGT GGATCACAGT CATCTTTATT TTCCGAGTCA TGATCCTAGT GGTGGCTGCC 120 
CAGGAAGTGT GGGGXGAGGA GCAAGAGGAC TTOGTCTGCA ACACACTGCA ACOGGGATGC 180 
AAAAATGTGT GCTATGACCA CTTTTTCOC3G GTGTCCCACA TCCGGCTGTG GGCCCrCCAG 240 
CTGATCTTOG l-CTOCACCCC AGCGCTGCTG GTGGCO^TGC ATGTGGCCTA CTACAGGCAC 300 
GAAACCACTC GCAAGTTCAG GCGAGGAGAG AAGAGGAATG ATTTGAAAGA CATAGAGGAC 360 
ATTAAAAAGC ACAAGGTTOG GATAGAGGGG TOGCTGTGGT GGACGTACAC CAGCAGCATC 420 
TTTTTCOGAA TCATCTTTGA AGCAGCCTTT ATGTATGTGT TTTACTTCCT TTACAATGGG 480 
TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCCCAACCr TGTTGACTGC 540 
TTTATTTCTA GGCCAACAGA GAAGACOGTG TTTACCATTT TTATGATTTC TGOGTCTGTG 600 
ATTTGCATGC TGCTTAAOGT GGCAGAGTTG TGCTACCTGC TGCTGAAAGT GTGTTTTAGG 660 
AGATCAAAGA GAGCACAGAC GCAAAAAAAT CAGCOCAATC ATGCCCTAAA GGAGAGTAAG 720 
CAGAATGAAA TGAATGAGCT GATTTOtfSAT AGTGGTOUU^ ATGCAATCAC AGGTTTCCCA 780 
AGCTAA 

Seq ID NO: 549 Protein sequence 
Protein Accession #: NP_006774.1 

1 11 21 31 41 51 
| | | | | | 
MDWCTLHTFI GGVNKHSTSI GKVWITVIFI FRVMILWAA QEVWGDEQED FVCaiTLQPGC 60 
KNVCYDHPFP VSHIRLWALQ LIFVSTPALIi VAMHVAYYRH ETTRKPRRGB KRNDFKDIED 120 
IKKHKVRIEG SLKWTYTSSI FPRIIFEAAP MWFYFLYNG YHLPWVLKOG IDPCPNLVDC 180 
FISRPTEKTV FTIFMISASV IGMLUJVAEL CVUiLKVCFR RSKRAQTQKM HPNHALKESK 240 
QNEMNELISD SGQNAITGFP S 

Seq ID NO: 550 DNA sequence 
Nucleic Acid Accession #: NM_002571.1 
Coding sequence: 99..587 

1 11 21 31 41 51 
| | | | | | 
CATCCCTCTG GCTCCAGAGC TCAGAGCCAC CCACAGCCGC AGCCATGCTG TGCCTCCTGC 60 
TCACCCTGGG CGTGGCCCTG GTCTGTGGTG TCCCGGCCAT GGACATCCCC CAGACCAAGC 120 
AGGACCTGGA GCTCCCAAAG TTGGCAGGGA CCTGGCACTC CATGGCCATG GCGACCAACA 180 
ACATCTCCCT CATGGCGACA CTGAAGGCCC CTCTGAGGGT CCACATCACC TCACTGTTGC 240 
CCACCCCGGA GGACAACCTG GAGATGGTTC TGCACAGATG GGAGAACAAC AGCTGTGTTG 300 
AGAAGAAGGT CCTTGGAGAG AAGACTGGGA ATCCAAAGAA GTTCAAGATC AACTATACGG 360 
TGGOGAACGA GGCCACGCTG CTCGATACTG ACTACGACAA TTTCCTGTTT CTCTGCCTAC 420 
AGGACACCAC CACCCCCATC CAGAGCATGA TGTGCCAGTA CCTGGCCAGA GTCCTGGTGG 480 
AGGACGATGA GATCATGCAG GGATTCATCA GGGCTTTCAG GCCCCTGCCC AGGCACCTAT 540 
GGTACTTGCT GGACTTGAAA CAGATGGAAG AGCCGTGCCG TTTCTAGCTC ACCTCOGCCT 600 
CCAGGAAGAC CAGACTCCCA CCCTTCCACA GCTCCAGAGC AGTGGGACTT CCTCCTGCCC 660 
TTTCAAAGAA TAACCACAGC TCAGAAGACXJ ATGAQGTGGT CATCTGTGTC GCCATCCCCT 720 
TCCTGCTGCA CACXnX^CACC ATTGCCATGG GGAGGCTGCT CCCTGGGGGC AGAGTCTCTG 780 
GCAGAGGTTA TTAATAAACC CTTGGAGCAT G 

Seq ID NO: 551 Protein sequence 
Protein Accession #: NP_002562.1 

1 11 21 31 41 51 
| | | | | | 
MDIPOTKODL ELPKLAGTWH SMAMATNNIS LTdATLKAPLR VHITSLLPTP EDNLEIVLHR 60 
WENNSCVEKK VLGEKTGNPK KFKINYTVAN EATLLDTDYD NFLFLCIiQDT TTPIQSMMCQ 120 
YLARVLVEDD EIHQGFIRAF RPLPRHLUyL U3UCQMEEPC RF 

Seq ID NO: 552 DNA sequence 
Nucleic Acid Accession #: NM_006500.1 
Coding sequence: 27..1967 

1 11 21 31 41 51 
| | | | | | 
ACTTGOQTCT 0GCCCTCCG6 CCAAGCATGG GGCTTCCCAG GCTGGTCTQC GCCTTCTTGC 60 
TOGCCGCCTG CTGCTGCTGT OCTCGOGTOG GGGGTGTGOC 0G6AGAG6CT GAGCAGCCTG 120 
CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 180 
AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTrTTCTGT CCACAAGGAG AAGOGGACGC 240 
TCATCTTCOG TGTGOGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC 300 
TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360 
GCATCTTCTT GTGCCAG6GC AAGOGCCCrC GGTCCCAGGA GTACOGCATC CAGCTCCGCG 420 
TCTACAAAGC TCCGGAGGAG CX3VAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 480 
GTAAGGAGCC TGAGGAGGTC GCIACCTQTO TAGGGAGGAA 0Q6GTACCCC ATTCCTCAAG 540 
TCATCTGGTA CAAGAATG6C CGGCCTCTGA AGGAGGAGAA GAACGGGGTC CACATTCAGT 600 
CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 
TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 720 
GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCX} ACAGAAAAAG 780 
TGTGGCTOGA AGTGGAGCCC GTGGGAAT6C TGAAG6AAGG GGACOGGGTG GAAATCAGGT 840 
GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCA6AAC CCCAGCACCA 900 
GGGA6GCAGA GGAAGAGACA ACCAACGACA AOGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960 
AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTG6AACTT GGACACCA7G ATATCGCTGC 1020 
TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA GGTCCGAGTG AGTCCCGCAG 1080 
CCCCT6A6AG ACAGGAAGGC AGCAGOCTCA CCCTGAOCTG 1GAGGCAGAG AGTAGCCAGG 1140 
ACCTOGAGTT CCAGTGGCTO AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCXTGT6C 1200 

TTCAGTTGCA T6ACCTGAAA OGGGAGGCAG GAGGCGGCTA i mri COSTG GCX3TCTGTGC 1260 

OCAGCATACC QG6CCTGAAC OSCACACAGC TGCTCAAGCT GGOCRTTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGCrTGTTG AATCTGTCTT 1380 

GTGAAGCGTC ACX3GCAOCCC OGGCCCACCA TCTCCTGGAA CGTOVACGGC ACGGCAAGTG 1440 

AACAAGACCA ACSATCCACAG OGAGTCCTGA GCACCCTGAA TGTCCTOSTG ACCCCGGAGC 1500 

TGTTOGAGAC AQGTGTTGAA TQCAOGGCCT OCAAOGACCT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCrGGA GCTQGTCAAT TTAACCAOCC TCACACCAGA CTOCAACACA ACCACTGGCC 1620 

TCAGCACTTC CACTGOCAGT CCTCA7ACCA GAGCCAACAG CAOCTCCACA GAGAGAAAGC 1680 

TGCCGGAGCC GGAGAGCOGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC 1740 

TGGaSGTGCT GGGCGCTGTC CTCTATrrCC TCTATAAGAA QQGCAAGCTO CCGTGCAGGC 1800 

GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCOGGG AGACCAGGGA GAGAAATACA TOGATCTCAG GCATTAGCXX CGAATCACTT 1980 

CAGCTCCCTT CCCTCCXTtCG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160 

GTCCACCACC ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 2220 

CCGAGGGGGT AGGAGAGTTT CTTGCAGAAC GWlTlil'i'C TTTACACACA TrATGGCTCT 2280 

AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGCSTAGOCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACCAT CCACGTGCAC CACTOAACTG AC3GACACACC GGAGCCAGGC 2400 

GOCTGCTCAT GTTGAAGTGC GCTGTTCACA OCXXSCTCCGG AGAGCACCCC AG0C3GCATCC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACCACJC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520 

ACATTTTTTC TTT6GTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGT6C0GG 2580 

GGCCAGGTGT GGTOGCICAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 2640 

TCACAAAGTC ftOGACGAGAC CATCCTGGCT AACAGGGTGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG GCGTAGTGGT TOGCACCTAT AGTCCTAGCT ACTOGGAACG 2760 

CTGAAGCAGG AGAATGGTAT GAATOC3U5GA QGTGGAGCTT GCAGTGAGCC GAGACGGTGC 2820 

CACTGCACTC CAGCCTCGGC AACACA(5CXSA GSCTaXTTCT CGAGGAAAAA AAAAGAAAAG 2880 

ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGITTTOGAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCOGTGTT CACTTGCTCC CATAGCCCrC TTGATGGATC ACX3TAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGAT6AG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AGAATCGTAC TTAGGGATGG AAAAOSGGGC CTGGCTAGAG CTTCGGGIGT GTGTGTCTGT 3180 

CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAG6TGT6TA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCXTCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTT 

Seq ID NO: 553 Protein sequence 
Protein Accession #: NP_006491.1 

1 11 21 31 41 51 

| | | | | | 

GLPRLVCAFL LAACCCCPRV AGVPGEAEQP APELVEVEVG STALLKOGLS QSQGNIiSHVO 60 

WFSVHKEKRT LIFRVRQGQG QSSPGEYEQR LSUJDRGATL ALTQVTPQDE RIFLOQGKRP 120 

RSQEYRIQLR VYKAPEEPNI QVNPLGIPVN SKEPEEVATC VGRNGVPIPQ VIWYKNGRPL 180 

KEEKNRVKIQ SSQTVESSGb YTIiQSILKAQ LVKEDKDAQF YCBLNYRLPS GNHMKESREV 240 

TVPVFYPTEK VWLEVEPVGM LKBGDRVEIR CLADGNPPPH FSISKQNPST REAEEETTND 300 

NGVLVLEPAR KEHSGRYBOQ AWNLDTMISL LSBPQELLVN YVSDVRVSPA APERQEGSSL 360 

TIiTCEABSSQ DLEEQWLREB TDOVLERGPV LQLHDLKRBA GGGVRCVASV PSIFGLMRTQ 420 

LVKLAIFGPP WMAFKERKVW VKENMVLNLS CEASGHPRPT ISWNV NGTAS EQDQDPQRVL 480 

STLNVLVTPE LLETGVECTA SNDLGKNTSI LFLBLVNLTT LTPDSNTTTG L5TSTASPHT 540 

RANSTSTERK LPEPESRGW IVAVIVCILV LAVLGAVLYF LYKKGKLPCR RSGKQEITLP 600 
PSRKTBI>WE VKSDKLPEEM GLLQGSSGDK HAPGDQGEIW IDLRH 

Seq ID NO: 554 DNA sequence 

Nucleic Acid Accession #: NM_003183.3 

Coding sequence: 165..2639 

1 11 21 31 41 51 

| | | | | | 

TOSAGCCTGG CGGTAGAATC TTCCCAGTAG GCGGOGCGGG AGGGAAAAGA GGATTGAGGG 60 

GCTAGGCCGG GCGGATCCCG TCCTCCCCOG ATGTGAGCAG TTTTOOGAAA CCCCGTCAGG 120 

OGAAGGCTGC CCACAGAGGT GGAGTOGGTA GOGGGGCOGG GAACATGAGG CAGTCTCTCC 180 

TATTCCTGAC CAGCGTGGTT CCTTTOGTGC TGGCGCCGCG ACCTCCGGAT GACCCGGGCT 240 

TCGGCCCCCA CX31GAGACTC GAGAAGCTTG ATTCTTTGCT CTCAGACTAC GATATTCTCT 300 

CTTTATCTAA TATCCAGCAG CATTCGGTAA GAAAAAGAGA TCTACA6ACT TCAACACATG 360 

TAGAAACACT ACTAACTTTP TCAGCTTT6A AAAGGCATTT TAAATTATAC CT6ACATCAA 420 

GTACTCAACG* TTTTTCACAA AATTTCAAGG TCGTGGTGGT GGATGGTAAA AACGAAAGCG 480 

AGTACACTGC AAAATGGCAG GACTTCTTCA CTGGACACGT GGTTGGTGAG CCTGACTCTA 540 

GGGTTCTAGC CCACATAAGA GATGATGATG TTATAATCAG AATCAACACA GATGGGGCCG 600 

AATATAACAT AGAGCCACTT TCGAGATTTG TTAATGATAC CAAAGACAAA AGAATGTTAG 660 

TTTATAAATC TGAAGATATC AAGAATGTTT CACGTTTGCA GTCTCCAAAA GTGTGTGGTT 720 

ATTTAAAAGT GGATAATGAA GAGTTGCTCC CAAAAGGGTT AGTAGACAGA GAACCACCTG 780 

AAGAGCTTGT TCATCGAGTG AAAAGAA6AG CIGACCCAGA TCCCATGAAG AACACGTGTA 840 

AATTATTGOT GGTAGCAGAT CATCGCTTCT ACAGATACAT GGGCAGAGGG GAAGA6AGTA 900 

CAACTACAAA TTACTTAATA GAGCTAATTG ACAGAGTTGA TGACATCTAT OGGAACACTT 960 

CA7GGGATAA TGCAGGTTTT AAAGGCTATG GAATACAGAT AGAGCAGATT CGCATTCTCA 1020 

AGTCTCCACA AGAGGTAAAA CCTGGTGAAA AGCACTACAA CATGGCAAAA AGTTACCCAA 1080 

ATGAAGAAAA GGATGCTTGG GATGTGAAGA TGTTGCTAGA GCAATTTAGC TTTGATATAG 1140 

CTGAGGAAGC ATCTAAAGTT TGCTTGGCAC ACCTTTTCAC ATACCAAGAT TTT GATATGG 1200 

GAACTCTTGG ATTAGCTTAT GTTGGCTCTC CCAGAGCAAA CAGCCATGGA GGTGTTTGTC 1260 

CAAAGGCTTA TTATAGCCCA GTTGGGAAGA AAAATATCTA TTTGAATAGT GGTTTGAOGA 1320 

GCACAAAGAA TTATGGTAAA ACCATCCTTA CAAAGGAAGC TGACCTGGTT ACAACTCATG 1380 

AATTQC3GACA TAATTTTCGA GCAGAACATG ATCOGGATQG TCTAGCAGAA TGTGCCCOGA 1440 

ATGAGGACCA GGGAGGGAAA TATGTCATGT ATCCCATAGC TGTGAGTGGC GATCAGGAGA 1500 

ACAATAAGAT GTTTTCAAAC TGCAGTAAAC AATCAATCTA TAACACCATT GAAAGTAAGG 1560 

CCXAGGAGTG TTTTCAAGAA CGCAGCAATA AAGTTTGTGG GAACTOGAGG GTGGATGAAG 1620 

GAGAAGAGTG TGATCCTGGC ATCATGTATC TGAACAAOGA CAOCTGCTGC AACAGOGACT 1680 

GCACGTIGAA GGAAGGTGTC C31GTGC3VGTG ACAGGAACAG TCCITGCTGT AAAAACTGTC 1740 

ACnriGAGAC TGCCCAGAAG AAGTGCCAGG AGGOGATTAA TGCTACTTGC AAAGGOGTCT 1800 

CCTACTGCAC AGGTAATAGC AGTGAGTGCC CGCCTCCAGG AAATGCTGAA AATGACACTG 1860 

TrreCTTGGA TCTTGGCAAG TGTAAGGATG GGAAATGCAT CCCTTTCTGC GAGAGGGAAC 1920 
AGCAGCTGGA 6TCCTOTGCA TGTAATGAAA CTGACAACTC CTGCAAGGTG TGCTGCAGGG 1980 

ACCrrrCTGG COGCrG'i'OTG OCCTATGTOG ATGCTGAACA AAA6AACTTA TTTTTGAGGA 2040 

AAGGAAAGCC CTGTACAGTA GGATTTTGTG ACATGAATGG CAAATGTGAG AAAOGAGTAC 2100 

AGGATGTAAT TGAAOGATTT TGGGATTTCA TTGACCAGCT GAGCATCAAT ACiTT TGGAA 2160 

AGTTTTTAGC AGACAACATC GTTGGGTCTG TCCTGGTTTT CTCCTTGATA TrTTGGATTC 2220 

CTTTCAGCAT TCTTGTCCAT TGTGTGGATA AGAAATTGGA TAAACAGTAT GAATCTCTGT 2280 

CTCTGTTTCA CCCCAGTAAC GTCGAAATGC TGAGCAGCAT GGATTCTGCA TCGGTTCGCA 2340 

TTATCAAACC CTTTCCTGOG OCCCAGACTC CAGGCOGCCT GCAGCCTGCC CCTGTGATCC 2400 

CTTCGGCXSCC AGCAGCTCCA AAACTGGACC ACXAGAGAAT GGACACCATC CAGGAAGACC 2460 

CCAGCACAGA CTCCCATATG GACGAGGATG GGTTTGAGAA GGACOCCTTC CCAAATAGCA 2520 

GCACAGCTGC CAAGTCATTT GAGGATCTCA OGGACCATCC GGTCGCCAGR AGTGAAAAGG 2580 

CTGCCTCCTT TAAACTCCAG CGTCAGAATC GTGTTAACAG CAAA6AAACA GAGTGCTAAT 2640 

TTACTTTCTCA GCTCTTCTGA CTTAAGTGTG CAAAATATTT TTATAGATTT GACCTACAAA 2700 

TCAATCACAG CTTCTATTTT GTGAAGACTG GGAAGTGACT TAGCAGATGC TGGTCATGTG 2760 

TTTCAACTTC CTGCAGGTAA ACACTTCTTG TGTGGTTTGG CCCT TCTCCT TTTGAAAAGG 2820 

TAAGGTGAAA GTGAATCTAC TTATTTTGAG GCTTTC A GGT TTTAGTTTrT AAAATATCTT 2880 

TTGACCTGTG GTGCAAAAGC AGAAAATACA GCTGGATTGG GTTATGAATA TTTACGTTTT 2940 

TGTAAATTAA TCTTTTATAT TGATAACAGC ACTCACTAGG GAAATGATCA GTTTTTTTTT 3000 

ATACACTGTA ATGAACCGCT GAATATGAAG CATTTGGCAT TTATTTGTGA GAAAAGTGGA 3060 

ATAGTTTTTT ■ rrmT i' -t TT TTTTTTTTGC CTTCAACTAA AAACAAAGGA GATA AATTTA 3120 

GTATACATTC TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCCAGAGT TTTTATGTAG 3180 

CAGGGAAAAT ATATATCTAA ATTTAGAAAT CATTTGGGTT AATATGGCTC TTCATAATTC 3240 

TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA TGGTAG^AG 3300 

TTGAATTTAT GGAATCTACC AACTGTTTAG GGCCX^XSATT TGCTGGGCAG TTTTTCTGTA 3360 

TTTTATAAGT ATCTTCATGT ATCCCTGTTA CTGATAGGGA TACATGTCTT AGAAA ATTCA 3420 

CTATTGGCTG GGAGTGGTGG CTCATGCCTG TAATCCCAGC ACTTGGAGAG GCTGAGGTTG 3480 
G6CCACTACA CTCCAGCCTG GGTGACAGAG TGAGATCTGC CTC 

Seq ID NO: 555 Protein sequence 
Protein Accession #: NP_003174.2 

1 11 21 31 41 51 

| | | | | | 

MRQSLLFLTS WPPVIAPRP PDDPGPGPHQ RLKKLDSLLS DYDILSLSNI QQHSVRKRDL 60 

QTSTHVETLL TPSALKRHFK LYLTSSTERF SQNPKWWD GiQIBSEYTAK WQDPPTGHW 120 

GEPDSRVLAH IRDDDVIIRI NTDGABYNIB PLMRPVNDTK DKRMLVYKSB DlKMVSRIflS 180 

PKVCGYLKVD NEELLPKGLV DREPPEELVH RVKRRADPDP MKNTCKLLW ADHRPYRYKG 240 

RGEESrmfY LIELIDRVDD lYRNTSWDNA GFKGYGIQIE QIRILKSPQE VKPGEKHYNM 300 

AKSYPNEEKD AWDVKMLLBQ PSFDIAEEAS KVCIiAHLFTY QDPDMGTIX5L AYVGSPRANS 360 

HGGVCPKAYY SPVGKKNIYI. NSGLTSTKNY GKTILTKEAD LVTTHELGHN FGAEHDPDGL 420 

AECAPMEDQG GKYVMYPIAV SGDHENNKMP SNCSKQSIYK TIESKAQECP QERSNKVCC2J 480 

SRVDEGEECD PGIMYLNNDT CCNSDCTLKE GVQCSDRNSP CCKNCQPETA QKKCQEAINA 540 

TCKGVSYCTG NSSECPPPGN AENDTVCLDL GKCKDGKCIP FCEREQQLES CACNETDNSC 600 

KVCCROLSGR CVPYVDAEQK NLFLRKGKPC TVGFCDMNGK CEKRVQDVIE RFHDFIOOLS 660 

INTPGKFIAD NIVGSVLVFS LIFWIPFSIL VHCVDKKLDK QYESLSLFHP SNVEMLSSMD 720 

SASVRIIKPF PAPQTPGRLQ PAPVIPSAPA APKLDKQRMD TIQEDPSTDS HKDEDGFEKD 780 
PFPNSSTAAK SPBDLTOHPV ARSEKAASFK LQRQNRVHSK ETEC 

Seq ID NO: 556 DNA sequence 

Nucleic Acid Accession #: NM_021832.1 

Coding sequence: 164..2248 

1 11 21 31 41 51 

TCGAGCCTGO CGGTAGAATC TTCCCAGTAG G0GG0G06GG AGGAAAAGAG GATTGAGGGG 60 

CTAGGCCGGG CGGATCCOGT CCTCCCCOSA TGTGAGCAGT TTTCCGAAAC CCOGTCAGGC 120 

GAAGGCTGCC CAGAGAGGTG GAGTOGGTAG CGGGGCCGGG AACATGAGGC AGTCTCTCCT 180 

ATTCCTGACC AGOSTGGTTC CTTTCGTGCT GGOGCCGCGA CCTCCGGAtG ACCC3GGGCTT 240 

CGGCCCCCAC CAGAGACTCG AGAAGCTTGA TTCTTTGCTC TCAGACTAOG ATATTCTCTC 300 

TTTATCTAAT ATCCAGCAGC ATTCGGTAAG AAAAAGAGAT CTACAGACTT CAACACATGT 360 

AGAAACACTA CTAACTTTTT CAGCTTTGAA AAGGCATTTT AAATTATACC TGACATCAAG 420 

TACTGAACGT TTTTCACAAA ATTTCAAGGT OGTGGTGGTG GATGGTAAAA AG6AAAG0GA 480 

GTACACTGTA AAATGGCAGG ACTTCTTCAC TGGACACGTG GTTGGTGAGC CTGACTCTAG 540 

GGTTCTAGCC CACATAAGAG ATGATGATGT TATAATCAGA ATCAACACAG ATGGGGCCGA 600 

ATATAACATA GAGCCACTTT GGAGATTTGT TAATGATACC AAAGACAAAA GAATGTTAGT 660 

TTATAAATCT GAAGATATCA AGAATGTTTC AOGTTTGCAG TCTCCAAAAG TGTGTGGTTA 720 

TTTAAAAGTG GATAATGAAG AGTTGCTCOC AAAAGGGTTA GTAGACAGAG AACCACCTGA 780 

AGAGCTT G TT CATOSAGTGA AAAGAAGAGC T6ACCCAGAT CCCATGAAGA ACACXnX3TAA 840 

ATTATTGGTG GTAGCAGATC ATCGCTTCTA CAGATACATG GGCAGAGGGG AAGA6AGTAC 900 

AACTACAAAT TACTTAATAG AGCTAATTGA CAGAGTTGAT GACATCTATC G6AACACTTC 960 

ATGGGATAAT GCAGGTTTTA AAGGCTATGG AATACAGATA GAGCAGATTC 6CATTCTCAA 1020 

GTCTCCACAA GAGGTAAAAC CTGGTGAAAA GCACTACAAC ATGGCAAAAA GTTACCCAAA 1080 

TGAAGAAAAG GATGCTTGGG ATGTGAAGAT GTTGCTAGAG CAATTTAGCT TTGATATAGC 1140 

TGAGGAAGCA TCrAAAGTTT GCTTGGCACA CCTTTTCACA TACCAAGATT TTGATATGGG 1200 

AACTCTTGGA TTAGCTTAT6 TTGGCTCTOC CAGAGCAAAC AGCCATGGAG GTGTTTGTCC 1260 

AAAGGCTTAT TATAGCCCAG TTGGGAAGAA AAATATCTAT TTGAATAGTG GTTTGA06AG 1320 

CACAAAGAAT TATGGTAAAA CCATCCTTAC AAAGGAAGCT GACCTGGTTA CAACTCATGA 1380 

ATTGGGACAT AATTTTGGAG CAORACATGA TCOGGATGGT CTAGCAGAAT GTGCCCCGAA 1440 

TGflCGAOCAG GGAGGGAAAT ATGTCATGTA TCOCaTAGCT GTGAGTGGCG ATCACGAGAA 1500 

CAATAAGATC TTTTCAAACT GCAGTAAACA ATCAATCTAT AAGACCATTG AAAGTAAGGC 1560 

CCAGGAG1GT TTTCMGAAC GCAGCAATAA AGmOIGGG AACTGGAGG6 TGGATGAAOG 1620 

AGAAi»GTGT GATCCTGGCA TC31TGTATCT GAACAAGGAC ACCTCCTGCA ACAGOGACTG 1680 

CACGTTGAAG GAAGGTGTCC AGTGCAGTGA CAGSAACAGT CCTTGCTGTA AAAACTGTCA 1740 

GrrTGAGACT GCCCAGAAGA AGTGCCAGGA GGOGATTAAT GCTACTTGCA AAGGOGTGTC 1800 

CTACTGCACA GGTAATAGCA GTGAGTGCCC GCCTCCAGGA AATGCTGAAG ATGACACTGT 1860 

TTCCTTGGAT CTTGGCAAGT GTAAGGATGG GAAATGCATC CCTTTCTGOG AGAGGGAACA 1920 

GCAGCTGGAG TCCTGTGCAT GTAATGAAAC TGACAACTCC TGCAAGGTGT GCTGC AGGGA 1980 

CCTTTCOSGC CGCTGTGTGC CCTATGrOai TGCTGAACAA AAOAACTTAT TTrTGAGGAA 2040 

AGGAAAGCCC TGTACAGTAG GATTTTGTGA CATGAATOGC AAATCTGAGA AAOGflGTA CA 2100 

GGATGTAATT GAAOGATTTT GGGATTTCAT TGACCAGCTG AGCATCAATA CTl'l CGGAAA 2160 

GTTTTTAGCA GACAACATCG TTGGGTCTGT CCTGGTTTTC TCCTPGATAT TTTGGATTCC 2220 

TTTCAGCATT CTTGTCCATT GTGTGTAACG TCGAAATGCT GAGCAGCATG GATTCTGCAT 2280 

OGGTTOGCAT TATCAAACCX; TTTCCTGCGC CCCAGACTCC AGGCCGCCTG CAGCCTGCCC 2340 

CIGTGATCOC TTCGGOGCCA GCAGCTCCAA AACTGGACCA OCAGAGAATG GA CACCAT CC 2400 

AGGAAGACCC CAGCACAGAC TCACATATGG AGGAGGATGG GTTTGAGAAfi GAOOCCTTCC 2460 

CAAATAGCAG CACAGCTGCC AAGTCATTTG AGGATCTCAC GGACCATCCX5 GTCACCAGAA 2520 

GTGAAAAGGC TGCCTCCTTT AAACTGCAGC GTCAGAATOG TGTTG ACRGC AAAGAAACAG 2580 

AGTGCTAATT TAGTTCTCAG CTCTTCTGAC TTAAGTGTGC AAAATATTTT TATAGATTTG 2640 

ACCTACAATC AATCACAiGCT TATATTTTGT GAAGACTGGG AAGTGACTTA GCAGAT GCTG 2700 

CTCATGTCTT TCAACTTCCT 6CRGGTAAAC AGTPCTTGTG TGGTTTGGCC CTT CTCCTT T 2760 

TCAAAAGGTA ACGTGAAGGT GAATCTAGCT TATTTTGAGG CTTTCAGGTT TPAGTTTTTA 2820 

AAATATCTTT TGACCTGTGG TGCAAAAGCA GAAAATACAG CTOGATTOQC TTATGAGTAT 2880 

TTAOGTTTTT GTAAATTAAT CTTTTATArP GATAACAGGC ACTGACTAQG GA AATGA TCA 2940 

GTTTTTTTTT ATACACTGTA ATGAACCGCT GAATATGAAG CATTTGGCAT TTATTTGTGA 3000 
GAAAAGTCGA ATAGTrTTTT Trrm T T l l TTTTTTTTGC CTTCAACTAA AAACAAAGGA. 3060 

GATAAATTTA GTATACATTG TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCCAGAGT 3120 

TTTTATGTAG CAGGGAAAAT ATATATCTAA ATTTAGAAAT CATTTGGGTT AATATGGCTC 3180 

TTCATAATTC TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA 3240 

TGGTAGCCAG TTGAATTTAT GGAATCTACC AACTGTTTAG GGCCCTGATT TGCTGGGCAG 3300 

TTTTTCTCTA TTTTATAAGT ATCTTCATGT ATCCCTGTTA CTGATAGGGA TACATGTCTT 3360 

AGAAAATTCA CTATTGGCTG GGAGTGGTGG CTCATGCCTG TAATCCCAGC ACTTGGAGAG 3420 
3421 GCTCAGGTTG CX3CCACPACA CTCCAGCCTG GGTGACAGAG TGAGATCIGC CTC 

Seq ID NO: 557 Protein sequence 
Protein Accession #: NP_068604.1 

1 11 21 31 41 51 

ilRQSLLFliTS WPFVLAPRP PDDPGFGPHQ HLEKLDSLLS DYDILSLSNI QQHSVRKSDL 60 

QTSTHVETLL TPSALKRHFK LYLTSSTERF SQNFiWWVD GKKESEYTVK SIQDFFTGHW 120 

GEPDSRVIAH IROTDVIIRI NTDGAEYNIE PLWRFVNDTK DKRMI.VYKSE DIKNVSRLQS 180 

PKVCGYLKVD NEELLPKGLV DREPPEELVH RViOlRADPDP MKNTCKLLW ADHRFYRYMG 240 

RGEESTTTNY LIEI^IDRVDD lYRNTSWDNA GFKGYGIQIE QIRILKSPQB VKPGEKHYNM 300 

AKSYPMEEKD AWDVKMLLEQ FSPDIABEAS KVCLAHLFTY QDFDMGTIXjIi AYVGSPRANS 360 

HGGVCPKAyy SPVGKKMIYL NSGLTSTKNY GKTILTKEAD LVTTHBLGHN FGAEHDPDGL 420 

AECAPHEDQG GKYVMYPIAV SGDHENNKMP SNCSKQSIYK TIESKAQECP QERSNKVCGN 480 

SRVDEGEECD PGIMYLNNDT CCNSDCTIiKE GVQCSDRNSP CCKNCQFBTA QKKCQEAINA 540 

TCKGVSYCTG NSSECPPP6N AEDDTVCLDL GKCKDGRCIP FCEREQQLBS CACNBTDMSC 600 

KVCCRDLSGR CVPYVOAEQK NLPLRKCKPC TVGFCDMSGK CEKRVQDVIE RFWDFIDQLS 660 
INTFGKFLAD NIVGSVLVPS LIFWIPPSIL VHCV 

Seq ID NO: 558 DNA sequence 

Nucleic Acid Accession #: NM_004994.1 

Coding sequence: 20..2143 

1 11 21 31 41 51 

| | | | | | 

AGACACCTCT GOCCTCACCA TGAGCCTCTG GCAGCCCCTG GTCCTGGTGC TCCTGGTGCT 60 

GGGCT6CTGC TTTGCTGCCC CCAGACAGCG CCAGTCCACC CTTGTGCTCT TCCCTGGAGA 120 

CCTGAGAACC AATCTCACCG ACAGGCAGCT GGCA6AGGAA TACCTGTACC GCTATGGTTA 180 

CACTCGGGTG GCAGAGATGC GTGGAGAGTC GAAATCTCTG GGGCCTGCXSC TGCTGCTTCT 240 

CCAGAAGCAA CTGTCCCTGC CCGAGACCGG TGAGCTGGAT AGCGCCACGC TGAAGGCCAT 300 

GCGAACCCCA CXXJTGCGGGG TCCCAGACCT GGGCAGATTC CAAACCTTTG AGGGCGACCT 360 

CAAGTGGCAC CACCACAACA TCACCTATTG GATCCAAAAC TACTCGGAAG ACTTGCCGCG 420 

GGCX3GTGATT GACGAOGCCT TTGCCCGOGC CTTCGCACTG TGGAG CGCGG TGACGCCGCT 480 

CACCTTCACT GGGGTGTACA GCCGGGAGGC AGACATCGTC ATCCAGTTTG GTGTCGOGGA 540 

GCAGGGAGAC GGGTATCCCT TCGACGGGAA GGACG6GCTC CTGGCACAC6 CCTTTCCTCC 600 

TGGCCCCGGC ATTCAGGGAG ACGCCCATTT CGACGATGAC GAGTTGTGGT CCCTGGGCAA 660 

GGGCGTCGTG GTTCCAACTC GGTTTGGAAA CGCAGATGGC GOGGCCTGCC ACTTCCCCTT 720 

CATCTTCGAG GGC06CTCCT ACTCTGCCTG CACCACCGAC GGTCGCTCCX5 ACX^GCTTGCC 780 

CTGGTGCAQT ACCACGGCCA ACTACGACAC CGACGACCGG TTTGGCTTCT GCCCCAGCGA 840 

GAGACTCTAC AO0CX3GGACG GCAATGCTGA TGGGAAACCC TGCCAGTTTC CATTCATCTT 900 

CCAAGGCCAA TOCTACTCCG CCTGCACCAC GGACGGTCGC T0C3GACGGCT ACCXSCTGGTG 960 

CGCCACCACC GCCAACTAGG ACCGGGAC3VA GCTCTTOGGC TTCTGCCOGA CCCGAGCT6A 1020 

CTCGACGGTG ATGGGGGGCA ACTCGGOGGG GGAGCTGTGC GTCTTCCCCT TCACTTTCCT 1080 

GGGTAAGGAG TACTOGACCT GTACCAGOGA GGGCCGCGGA GATGGGCGCC TCTGGTGCXSC 1140 

TACCACCTCG AACTTTGACA GCGACAAGAA GTGGGGCTTC TGCCCGGACC AAGGATACAG 1200 

TTTGTTCCTC GTGGCGGCGC ATGAGTTOGG CCACGOGCTG GGCTTAGATC ATTCCTCAGT 1260 

GCOGGAGGCG CTCATGTACC CTATGTACCG CTTCACTGAG GGGCCCCCCT TGCATAAGGA 1320 

OGACGTGAAT GGCATCCGGC ACCTCTATGG TCCTCGCCCT GAACCTGAGC CACGGCCTCC 1380 

AACCACCACC ACACOGCAGC CCACGGCTCC CCCGACGGTC TGCCCCACCG GACCCTCCAC 1440 

TGTCCACCOC TCAGAGCGCC CXACAGCTGG CCXTCACAGGT CCCCCCTCAG CTGGCCCCAC 1500 

AGGTCCCCCC ACTGCTGGCC CTTCTACGGC CACTACTGTG CCTTTGAGTC 0GGTGQA06A 1560 

TGCCTGCAAC GTGAACATCT TOGACGCCAT C3GCGGAGATT GGGAACCAGC TGTATTTGTT 1620 

CAAGGATGGG AAGTACTGGC GATTCTCTGA GGGGAGGGGG AGCCGGCCGC AGGGCCCCTT 1680 

OCTTATOGCC GACAAGTGGC COGOGCTGCC CaSCAAGCTG GACTOGGTCT TTGAGGAGOC 1740 

GCTCTCCAAG AAGCTTTTCT TCTTCTCTGG GOGOCAGGTG TGGGTGTACA CAGGGGOGTC 1800 

GGTGCTGGGC COGAGGOGTC TGGACAAGCT GGG0CT6GGA GOCX«AOGTGG C0CAGGT6AC 1860 

CQOQGCGCTC OOGAGTOGOV 0G066AAGAT GC T GC TGYit ; AGGGGGGG6C GCCTCT G GAG 1920 

GTTaSAOnx; AAGGOGCAGA TGGTOSATCC CXXSGAGCGCX: AGCGAGGTGG ACCGGATGTT 1980 

CCCOSGGGTG CCnTGGACA CGCACGACGT CTTCCAGTAC OGAGAGAAAG CCTATTTCTG 2040 

CCAGGACCGC TTCTACTGGC GCGTGAGTTC CCGGAGTGAG TTGAACCAGG TGGACCAAGT 2100 

GGGCTACGTG ACCTATGACA TCCTGCAGTG CCCTGAGGAC TAGGGCTCCC GTCCTGCTTT 2160 

GCAGTGCCAT GTAAATCCCC ACTGGGACCA ACCCTGGGGA AGGAGCCAGT TTGCOSGATA 2220 

CAAACTGGTA TTCTGTTC TG G AGGAAAG GG AGGAGTGGAG GTOQGCTGQG CCCTCTCTTC 2280 
TCACCTTTGT TTTTTGTTGG AGTGTTTCTA ATAAACTTGG ATTCTCTAAC CTTT 

Seq ID NO: 559 Protein sequence 
Protein Accession #: NP_004985.1 

1 11 21 31 41 51 

| | | | | | 

KSLWQPLVLV LLVLGCXZFAA PRQRQSTLVL FPGD1*RTKLT ORQLAEEYLY RYGYTRVAEM 60 

RGESKSLGPA LLLLQKQLSL PETGELDSAT LKAMRTPRCX3 VPDLGRFQTF BGDLKWHHHN 120 

ITYWIONYSE DLPRAVIDDA FARAPALWSA VTPLTPTRVY SRDADIVIQP GVAEHGDGYP 180 

FDGKDGLLAH APPPGPGIQG DAHFDDDELW SLGKGWVPT RFGNADGAAC HPPFIPBGHS 240 

YSACTTDGRS DGLPWCSTTA NYDTDDRFGP CPSERLYTRD GNADGKPCQF PFIFQGQSYS 300 

ACTTDGRSDG YRWCATTAMY DRDKLP6FCP TRADSTVMGG NSAGELCVFP FTFLGXEYST 360 

CTSEGRGDQt LHCATTSNFD SOKKNGFCFD QGYSZiFLVAA HEFSKALGU) HSSVPEALKY 420 

PMYRPTE6PP LHKDDVNGIR HLYGPRPEPB PRPPTTTTPQ PTAPPTVCPT GPPTVHPSSR 480 

PTAGPTGPPS AGPTGPPTAG PSTATTVPLS PVDnAOfVNI FDAIAEIGNQ LYLFKDGKVW 540 

RPSBGRGSRP QGPFLIADKW PALPRKLDSV FEEPLSKKLF FFSGRQVWVY TGASVLGPRR 600 

LDKLGLGADV AQVTGALRSG RGKMUiFSGR RLNRFDVKAQ MVDPRSASBV DRMFFGVPLD 660 
THDVFQYREK AYFCQDRPyH RVSSRSELKQ VDQVGYVTYD XLQCPED 

Seq ID NO: 560 DNA sequence 

Nucleic Acid Accession #: NM_000213.1 

Coding sequence: 127..5385 

1 11 21 31 41 51 

| | | | | | 

GGCCG6C3GOG CTGCAGOCCX: ATCTCCTAGC GGCAGCCCAG GCGCGGAGGG AGCX3AGT00G 60 

CCCGGAG6TA GGTCCAiGGAC GGGOGCACAG CAGCAGCOGA GGCTGGCCGG GAGAGGGAGG 120 

AAGAGGATGG CAGGGCCACG CCCCAGCCCA TGGGCCAGGC TGCTCCTGGC AGCCTTGATC 180 

AGCETCAGCC TCTCTGGGAC CTTGGCAAAC CGCTGCAAGA AGGCCCCAGT GAAGAGCTGC 240 

ACGGAGT6TG TCCGTGTGGA TAAGGACTGC GCCTACTGCA CAGACGAGAT GTTCAGGGAC 300 

CGGCGCTGCA ACACCCAGGC GGAGCTGCTG GCCG0G6GCT GCCA60GGGA GAGCATCGTG 360 

GTCATGGAGA GCAGCTTCCA AATCACAGAG GAGACCCAGA TTGAC3VCCAC CCTGOGGOGC 420 

AGCCAfiATGT CCCCCCAAGG CCTGCGGGTC CGTCTGCGGC COGGTGAGGA GOGGCATTTT 480 

GAGCTGGAGG TGTTTGAGCC ACTGGAGAGC CCCGTGGACC TGTACATCCT CATGGACTTC 540 

TCCAACrCCA TGTCCGATGA TCTGGACAAC CTCAAGAAGA TGGGGCAGAA CCTGGCTCGG 600 

GTCCTGAGCC AGCTCACCAG CX3ACTACACT ATTGGATTTG GCAAGTTTGT GGACAAAGTC 660 

AGOGTCCCGC AGACGGACAT GAGGCCTGAG AAGCTGAAGG AGCCCTGGCC CAACAGTGAC 720 

CCCCCCTTCT CCTTCAAGAA CGTCATCAGC CTGACAGAAG ATGTGGATGA GTTCCGGAAT 780 

AAACTGCAGG GAGAGGGGAT CTCAGGCAAC CTGGATGCTC CTGAGGG06G CTTCGATGCC 840 

ATCCT6CAGA CAGCTGTGTG CACGAGGGAC ATTGGCTGGC GCCCGGACAG CACCCACCT6 900 

CTGGTCTTCT CCACCGAGTC AGCCTTCCAC TATGAGGCTG AT6GCGCCAA CXSTGCTGGCT 960 

GGCATCATGA GCOGCAACGA TGAACGGTGC CACCTGGACA CCACGGGCAC CTACACCCAG 1020 

TACAGGACAC AGGACTACCC GTCGGTGCCC ACCCTGGTGC GCCTGCTOGC CAAGCACAAC 1080 

ATCATCCCCA TCrTTGCTGT CACCAACTAC TCCTATAGCT ACTAOGAGAA GCTTCACACC 1140 

TATTTCCCTG TCTCCTCACT GGGGGTGCTG CAGGAGGACT OSTCCAACAT GGTGGAGCTG 1200 

CTGGAGGAGG CCTTCAATCG GATCCGCTCC AACCTGGACA TCCGGGCCCT AGACAGCCCC 1260 

CGAGGCCTTC GGACAGAGGT CACCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320 

CACATCCX5GC GGGGGGAAGT GGGTATATAC CAGGTGCAGC TGOGGGCCCT TGAGCACGTG 1380 

GATGGGACGC ACGTGTGCCA GCTGCCGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 

TCCTTCTCCG AOGGCCTCAA GATGGAOGOG GGCATCATCT GTGATGTGTG CACCTGCGAG 1500 

CTGCAAAAAG AGGT6C66TC AGCTCGCTGC AGCTTCAACG GAGACTTCGT GTGCGGACAG 1560 

TGTGTGTGCA 60GAGGGCTG GAGTGGOCAfi ACCTGCAACT GCTCCACCGG CTCTCTGAGT 1620 

GACATTCAGC CCT6CCT6CG GGAGGGC6AG GACAAGCCGT GCTCCGGCCG TGGGGAGTGC 1680 

CAGTGCGGGC ACTGTGTGTG CTACGGCGAA GGCCXKTPACXS AGGGTCAGTT CTGCGAGTAT 1740 

GACAACTTCC AGTGTCCCOG CACTTCCGGG TTCCTCTGCA ATGACCGAGG ACGCTGCTCC 1800 

ATGGGCCAGT GTGTGTGTGA GCCTGGTTGG ACAGGCCCAA GCTGTGACTG TCCCCTCAGC I860 

AATGCCACCT GCATGGACAG CAATGGGGGC ATCTGTAATG GAOSTGGCCA CTGTGAGTGT 1920 

GGCOGCTGCC ACPGCCACCA GCAGTOGCTC TACACGGACA CCATCTGCGA GATCAACTAC 1980 

TCGGOGATCC ACCOGGGCCT CTGCGAGGAC CTACGCTCCT GCGTGCAGTG CCAGGCGTGG 2040 

GGCACCGGCG AGAA6AAGGG ■ GOGCACGTGT GAGGAATGCA ACTTCAAGGT CAAGATGGTG 2100 

GACGAGCTTA AGAGAGCCGA GGAGGTGGTG GTGCGCTGCT CCTTCOQGGA OGAGGATGAC 2160 

GACTGCACCT ACAGCTACAC CATGGAAGGT GACGGOGCCC CTGGGCCCAA CAGCACTGTC 2220 

CTGGTGCACA AGAAGAAGGA CTGCCCTCCG GGCTCCTTCT GGTGGCTCAT CCCCCTGCTC 2280 

CTCCTCCTCC TGOCGCTCCT GGCCCTGCTA CTGCTGCTAT GCTGGAAGTA CTGTGCCTGC 2340 

TGCAAGGCCT GCCTGGCACT TCTCCCGTGC TGCAACCGAG GTCACATOGT GGGCTTTAAG 2400 

GAA6ACCACT ACATGCTG06 GGAGAACCTG ATGGCCTCTG ACCACTTGGA CAGGCCCATG 2460 

CTGOGCAGCG GGAACCTCAA GGGCCGTGAC GTGGTCCGCT GGAAGGTCAC CAACAACATG 2520 

CAGCGGCCTG GCTTTGCCAC TCATGCCGCC AGCATCAACC CCACAGAGCT GGTGCCCTAC 2580 

GGGCIGTCCT TGOGCCTGGC CCGCCTTTGC ACOGAGAACC TGCTGAAGCC TGACACTCGG 2640 

GAGTGCGCCC AGCTGOGCCA G6AGGTGGAG GAGAACCTGA AOGAGGTCTA CAGGCAGATC 2700 

TCOGGTGTAC ACAAGCTCCA GCAGACCAAG TTCOGGCAGC AGCOCAATGC OGG6AAAAAG 2760 

CAAGACCACA CCATTGTGGA CACAGTGCTG ATGGCGCCCC GCTOGGCCAA GCOGGCCCTG 2820 

CTGAAGCTTA CAGAGAAGCA GGTGGAACAG AGGGCCTTCC ACGACCTCAA GGTGGCCCCC 2880 

GGCTACTACA OCCTCACTGC AGACCAGGAC GCCCGGC3GCA TGGTG6ACTT CCAGGAGGGC 2940 

GTGGA6CTGG TGGA0GTA06 GGTGCCCCTC TTTATCCGGC CTGAG6ATGA GGAOGAfiAAG 3000 

CAGCTGCTGG TGGAGGCCAT GGACGTGCCC GCAG6CACTG CCAGOCTOGG O06GG6CCTG 3060 
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GTAAACATCA CCATCATCAA GGAGCAAGCC AGAGAOGTGG TGTCCTTTGA GCAGCCTGAG 3120 

TTCXOOGTCA GCOGCGGGGA OCAGGTGGCC OGCATCOCTG TCATCOXSOG TGTCCTGGAC 3180 

GGCGGGAAGT CCCAGGTCTC CTACGGCAOl CAGGATGGCA COGCGCAGGG CAACCGGGAC 3240 

TACATOCOOG TGGAGGGTCA GCTtSCIGTTC CRGCCTGtSGG AGGCCTGGAA AGAGCTGCAG 3300 

GTGAAGCTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCTGC GGGGCX3GCCA GGTCGGCCGT 3360 

TTCCAOGTCC AGCTCAGCAA CCCrAAGTTT GGGGCCCACC TGOGCCAGCC O CRCTOCftO C 3420 

ACCATCATCA TCAGGGACCC AGATGAACTG GACOGGAGCT TCACGAGTCA GATGTTGTCA 3480 

TCACAGCO^C CCCCTCACGG CGACCTCCGC GCCCCGCAGA ACCOCAATCC TAAGGCOGCT 3540 

GGGICCAGGA AGATCCATTT CAACTGGCTG CCCCCTTCTG GCAAGCCAAT GGGGTACflCG 3600 

GTAAAGTACT GGATTCAGGG TGACTCCGAA TCCGAAGCCC ACCTGCTOGA CAGCAAGGTG 3660 

CCCTCAGTCG AGCTCACCAA CCTGTACCOG TATTGOGACT ATGAGATCAA GGTGTGCGCC 3720 

TAOSGGGCTC AGGGCGACGG ACCCTACAGC TCCCTGGTGT CXTTGCOGCAC CCACX»GGAA 3780 

CTGCCCAGOG AGCCAGGGCG TCTGGCCTTC AATGTOGTCT CCTCCAOGGT GACCCAGCTG 3840 

AGCTCGGCIG AGCOGGCTGA GACCAAOGGT GAGATCACAG CCTACGAGGT CTGCTATGGC 3900 

CTGGTCAAOG ATGACAACCG ACCTATEGGG CCCATGAAGA AAGTGCTGGT TGACAACCCT 3960 

AAGAACCGGA TCCTGCTTAT TGAGAACCTT OGGGAGTCCC AGCCCTACCG CTACACGGTG 4020 

AAGGOGCGCA ACGGGGCCGG CTGGGG6CCT caCCGGGAGG CCATCATCAA CCTGGCCACC 4080 

CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA T CCCTA TCGT GGAOSCCCAG 4140 

AGCGGGGAGG ACTACGACAG CTTCCTTATG TACAGOGATG ACX3TTCTA0G CTCTCCATC3G 4200 

GGCAGCCAGA GGCCCAGCGT CTCCGATCAC ACTGAGCACC TGGTGAATGG CCGGATGGAC 4260 

TTTGCCrTCC CGGGCAGCAC CAACTOCCTG CACAGGATGA CCACGACCAG TGCTGCTGCC 4320 

TATCGCAOCC ACCTGAGCCC ACACX3TGCCC CACCGCGTGC TAAGCACATC CTCCACCCTC 4380 

ACACGGGACT ACAACTCACT GACCOGCTCA GAACACTCAC ACTCGAOCAC ACTGCCGAGG 4440 

GACTACTCCA CCCTCACCTC OGTCTCCTOC CAOGACICTC GCCTGACTGC TGGTGTGCCC 4SO0 

GACACGCCCA CCOGCCTGGT GTTCTCTGCC CTGGGGCCCA CATCTCTCAG AGTGAGCTGG 4560 

CAGGAGCCGC GGTGCGAGCG GCCGCTGCAC 6GCTACAGTG TGGAGTACCA GCTG CTGAAC 4620 

GGCGGTGAGC TGCATCGGCT CAACATCCCC AACCCTGCCC AGACCTCGGT GGTGGTGGAA 4680 

GACCTCCTGC CCAACCACTC CTACGT6TTC OGOGTGCGGG CCCA6AGCCA GGAAGGCTGG 4740 

GGCCGAGAGC GTGAGQOTGT CATCACCATT GAATCCCAGG TGCACCOGCA GAGCCCACTG 4800 

TCTCCCCTGC CAGGCTCCGC CTTCACmO AGCACTCCCA GTGCCCCAGG CCCGCTGGTG 4860 

TTCACTGCCC TGAGCCCAGA CTCGCTGCAG CTGAGCTGGG AGCGGCCACG GAGGCCCAAT 4920 

GGGGATATCG TCGGCTACCT GGTGACCTGT GAGATGGCCC AAGGAGGAGG GCCAGCCACC 4980 

GCATTCCGGG TGGATGGAGA CAGCCCCGAG AGCCGGCTGA CCGTGCCGGG CCTCAGOGAG 5040 

AACCXGCCCT ACAAGTTCAA GGTGCAGGCC AGGACCACTG AGGGCTTCGG GCCA6AGCGC 5100 

GAGGGCATCA TCACCATAGA GTCCCAGGAT GGAGGACCCT TCCCGCAGCT GGGCAGCCGT 5160 

GCCGGGCTCT TCCAGCACCC GCTGCAAAGC GAGTACAGCA GCATCACCAC CACCCACACC 5220 

AGCGCCACCG AGCCCTTCCT AGTGGATGGG COGACCCTGG GGGCCCAGCA CCT0GAG6CA 5280 

GGCGGCTCCC TCACCOGGCA TGTGACCCAG GAGTTTGTGA GCCGGACACT GACCACCAGC 5340 

GGAACCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA CTTGACCGCA CCCTGCCCCA 5400 

CCCCCGCCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC COGGAGCCTC CTCAGCTACT 5460 

CCATCCTTGC ACCCCTGGGG GCCCAGCCCA CCCGCATGCA CAGAGCAGGG GCTAGGTGTC 5520 

TCCTGGGAGG CAT6AAGGGG GCAAGGTCCG TCCTCTGTGG GCCCAAACCT ATT TGTA ACC 5580 

AAAGAGCTGG 6AGCAGCACA AGGACCCAGC CTTTGTTCTG CACTTAATAA ATGGTTTTGC 5640 
TACTG 

Seq ID NO: 561 Protein sequence 
Protein Accession #: NP_000204.1

I 11 21 31 41 51 

II 1 1- i I 

MAGPRPSPWA RLLLAALISV SLSGTLAKRC KKAPVKSCTE CVRVDKDCAY CT DEMF RDRa 60 

CNTQAELLAA GCQRESIWM ESSFQITBET QIDTTLRRSQ MSPQGLRVRL RPGEBRHFEL 120 

BVFBPLESPV DLYILMDFSN SMSDDIDNLK KMGQNLARVL SQLTSDYTIG FGXFVDKVSV 180 

PQTDMRPEKL KEPWPNSDPP PSPKNVISLT EDVDEFRNKL QGERISGNLD APBGGFDAIL 240 

QTAVCTRDIG WRPDSTHLLV FSTESAFHYE ADGANVLAGI MSRNDBRCHL DTTGTYTQYR 300 

TQDYPSVPTL VRLLAKHNII PIFAVTNYSY SYYEKLHTYP PVSSLGVLQB DSSNIVELLE 360 

EAFNRIRSNU DIRALDSPRG LRTEVTSKMF QKTRTGSPHI RRGEVGIYQV QLRALEHVDG 420 

THVCQLPEDQ KOUHLKPSF SDCLiO-IDAGI ICDVCTCELQ KEVRSARCSF NGDFVOGQCV 480 

CSEGWSGQTC MCSTGSLSDI QPCLREGEDK PCSGRGECQC GHCVCYGEGR YEGQFCEYDN 540 

PQCPRTSGFL CNDRGRCSMG QCVCEPGWTG PSCDCPLSNA TCIDSNGGIC NGRGHCECGR 600 

CHCHQQSLYT DTICEINYSA IHPGI^CEDLfi SCVQCQAWGT GEKKGRTCEE CNFKVKMVDE 660 

LKRAEEWVR CSFRDEDDDC TYSYTMBXSDG APGPNSTVXiV HKKKDCPPGS FWWLIPLLtiL 720 

LLPLLALLLL UMKYCACCK ACLAIiLPCai RGHMVGFKBD HYMLRENI«A SDHLDTPMLR 780 

SGNLKGRDW RWKVTNNMQR PGPATHAASI NPTELVPYGL SLRLARLCTE NLIiKPDTREC 840 

AQLRQEVEEN LNEVYRQISG VHKLQQTKFH QQPNAGKKQD HTIVDTVLKA PRSAKPALUC 900 

LTEKQVEQRA FHDLKVAPGY YTLTADQDAR GMVEPQEGVE LVDVRVPLFI RPEDDDEKQL 960 

LVBAIDVPAG TATLGRRLVN ITIIKEQARD WSFEQPEFS VSRGDQVARI PVIRRVLDGG 1020 

KSQVSYRTQO GTAQGNRDYI PVGG6LLFQP GEAWKELQVK IJiELQEVDSIi LRGRQVRRFU 1080 

VQIiSNPKFGA HLGQPHSTTI IIRDPDELDR SFTSQMZiSSQ PPPHGDL6AP QNPNAKAAGS 1140 

RKIHFNWLPP SGKPMGYRVK YWIQGDSESE AHLLDSKVPS VELTNLYPYC DYEMKVCAYG 1200 

AQGBGPYSSL VSCRTHQEVP SBPGRLAFNV VSSTVTQLSW AEPAETNGEI TAYEVCYGIiV 1260 

NDDNRPIGPM KKVLVDNPKN RMI*L1ENLRE SQPYRYTVKA RKGAGWGPSR EAIINLATQP 1320 

iCRPMSIPIIP DIPIVDAQSG EDYDSFLMYS DDVLRSPSGS QRPSVSDDTE HLVNGRMDFA 1380 

FPGSTNSLHR HTTTSAAAYG THLSPHVPHR VLSTSSTLTR DYNSLTRSEH SHSTTLPRDY 1440 

STLTSVSSHD SRLTAGVPDT PTRLVFSALG PTSIiRVSWQE PRCERPWJGY SVEYQLLNGG 1500 

ELHRLNZPNP AQTSVWEDL LPNRSYVPRV SAQSQEGWGR GREGVITIES QVHPQSPLCP 1560 

LPGSAFTLST PSAPGPLVFT ALSPDSXiQLS NERPRRPKGD IVGYtiVTCEH AQGG6PATAF 1620 

RVDGDSPESR LTVPGLSENV PYKFKVQART TBGFGPERBG IITIESQDGG PPPQLGSRAG 1680 

LFQHPLQSEY SSITTTHTSA TEPFLVDGPT LGAQHLEAGG SLTRHVTQEF VSRTLTTS6T 1740 
LSTHMDQQPF QT 

Seq ID NO: 562 DNA sequence

Nucleic Acid Accession #: NM_013332.1

Coding sequence: 1..63

1 11 21 31 41 51 

I ! i t I I 
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GCAOGAGGGC GCTTTTGTCr C0GGTGAG7T TTGTGGOGOG AAGCTTCTGC GCrGGTGCIT 60 

AGTAACOGAC TTTCCTCCGG ACTCCTGCAC GACCTGCTCC TACAGOOGGC GATCCACTCC 120 

OGGCTGTTCC CCOGGAGGGT CCAGAGGCCT TTCAGAAGGA GAAGGCAGCT CTGTTTCTCT 180 

GCAGAGGAGT AGGGTCCTTT CAGCCATGAA GCATGTGTTG AACCTCTACC TGTTAGGTGT 240 

GGTACTGACC CTACTCTCCA TCTTOGTTAG AGTGATGGAG TCCCTAGAAG GCTTACTAGA 300 

GAGCCOVTOO OCTOGGACCT CCT GG ftCCAC CAGAAGGCAA CTAOCOIACA CAGAGOOCAC 360 

CAAGGGCCTT CCAGACCATC CATCCAGAAG C3VTGTGATAA GACCTCCTTC CATACIGGOC 420 

ATATTTTGGA ACACTGACCT ACACATGTCC AGATGGGAGT CCCATTCCTA GCAGACAAGC 480 

TGAGCACOGT TGTAACCAGA GAACTATTAC TAGGCCTTGA ACAACCTGTC TAACTGGATG 540 

CTCATT G CCT GGGCAAGGCC TGTTTAGGCC GGTTGOGGTG GCTCATGCCT GTAATOCTAG 600 

CACTTTGGGA GGCTGAGGTG GGTGGATCAC CTGAGGTCAG GAGTTCGAGA CCAGCCTCGC 660 

CAACATGGCG AAACCCCATC TCTACTAAAA ATACAAAAGT TAGCTGGGTG TGGTGGCAGA 720 

GGCCTGTAAT CCCAGTTOCT TGGGAGGCTG AGGCGGGAGA ATTGCTTGAA CCOGGGGAGG 780 

GAGGTTGCAG TGAACCGAGA TCGCACTGCT GTACCCAGCC TGGGCCACAG TGCAAGACTC 840 

CATCTCAAAA AAAAAAAGAA AAGAAAAAGC CTGTTTAATG CACAGGTGTG AGTGGATTGC 900 

TTATGGCTAT GAGATAGGTT GATCTCGCCC TTACCCOGGG GTCTGGTGTA TGCTGTGCTT 960 

TCCTCAGCAG TATGGCTCTG ACATCTCTTA GATGTCCCAA CTTCAGCTGT TGGGAGATGG 1020 

TGATATTTTC AACCCTACTT CCTAAACATC TGTCTGGGGT TCCTTTAGTC TTGAATGTCT 1080 

TATGCTCAAT TAITTGGTCST TGAGCCTCTC TTCCACAAGA GCTCCTCCAT GTTTOGATAG 1140 

CAGTTGAAGA GGTTGTGTGG GTGGGCTGTT GGGAGTGAGG ATGGAfiTGTT CRGTGCCCAT 1200 

TTCTCATTTT ACATTTTAAA GTCGTTCCTC CAACATAGTG TGTATTGGTC TGAAGGGGGT 1260 

GGTGGGATGC CAAAGCCTGC TCAAGTTATG GACATTGTGG CCACCATGTG GCTTAAATGA 1320 
TTPTTTCTAA CTAATAAAGT GGAATATATA TTTCAAAAAA AAAAAAAAAA AA 

Seq ID NO: 563 Protein sequence
Protein Accession #: NP_037464.1

1 11 21 31 41 51

| | | | | |

MKHVLNLYLL GVVLTLLSIF VRVMESLEGL LESPSPGTSW TTRSQLANTE PTKGLPDHPS 60
RSM


Seq ID NO: 564 DNA sequence

Nucleic Acid Accession #: NM_023915.1

Coding sequence: 250..1326

1 11 21 31 41 51

| | | | | |

GGCACGAGGG TTTCGTTTTC ATGCTTTACC AGAAAATCCA CTTCCCTGCC GACCTTAGTT 60

TCAAAGCTTA TTCTTAATTA GAGACAAGAA ACCTGTTTCA ACTTGAAGAC ACCXTTATGAG 120

GTGAATGGAC AGCCAGCCAC CACAATGAAA GAAATCAAAC CAGGAATAAC CTATGCTGAA 180

CCCACX5CCTC AATCGTCCCC AAGTGTTTCC TGACACGCAT CTTTGCTTAC AGTGCATCAC 240

AACTGAAGAA TGGGGTTCAA CTTGACGCTT GCAAAATTAC CAAATAACX3A GCTGCACGGC 300

CAAGAGAGTC ACAATTCAGG CAACAGGA6C GACGG6CCAG GAAAGAACAC CACCCTTCAC 360

AATGAATTTG ACACAATTGT CTTGCGGGTG CTTTATCTCA TTATATTTGT GGGAAGCATC 420

TTGCTGAATG 6TTTAGCAGT GTGGATCTTC TTCCACATTA GGAATAAAAC CAGCTTCATA 480

TTCTATCTCA AAAACATAGT GGTTGCAGAC CTCATAATGA CGCTGACATT TCCATTTCGA 540

ATAGTCCATG ATGCAGGATT TGGAOCTTGG TACTTCAAGT TTATTCTCTG CAGATACACT 600

TCAGTTTTGT TTTATGCAAA CATGTATACT TCCATCGTGT TCCTTGGGCT GATAAGCATT 660

GATCGCTATC TGAAGGTGGT CAAGCCATTT GGGGACTCTC GGATGTACAG CATAACCTTC 720

ACGAAGGTTT TATCTGTTTG TGTTTGGGTG ATCATGGCTG TTTTGTCTTT GCCAAACATC 780

ATCCTGACAA ATGGTCAGCC AACAGAG6AC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840

CCTTTGGGGG TCAAATGGCA TACX3GCAGTC ACCTATGTGA ACAGCTGCTT GTTTGTGGCC 900

GTGCTGGTGA TTCTGATCGG ATGTTACATA GCCATATCCA GGTACATCCA CAAATCCAGC 960

AGGCAATTCA TAAGTCAGTC AAGCOGAAAG CGAAAACATA ACCAGAGCAT CAGGGTTGTT 1020

GTGGCTGTGT TTTTTACCTG CTTTCTACCA TATCACTTGT GCAGAATTCC TTTTACTTTT 1080

AGTCACTTAG ACAGGCTTTT AGATGAATCT GCACAAAAAA TCCTATATTA CTGCAAAGAA 1140

ATTACACTTT TCTTGTCTGC GTGTAATGTT TGCCTGGATC CAATAATTTA CTTTTTCATG 1200

TGTAGGTCAT TTTCAAGAAG GCTGTTCAAA AAATCAAATA TCA6AACCAG GAGT6AAAGC 1260

ATCAGATCAC TGCAAAGTGT GAGAAGATOG GAAGTTCGCA TATATTATGA TTACACTGAT 1320

GTGTAGGCCT TTTATTGTTT GTTGGAATOG ATATGTACAA AGT6TAAATA AATGTTTCTT 1380
TTCATTATCC TTAAAAAAAA AA



Seq ID NO: 565 Protein sequence
Protein Accession #: NP_076404

1 11 21 31 41 51

| | | | | |

MGFNLTLAKL PNNELHGQES HNSGNRSDGP GKNTTLHNEF DTIVLPVLYL IIFVASILLN 60

GLAVWIFFHI RNKTSFIFYL KNIVVADLIM TLTFPFRIVH DAGFGPWYFK FILCRYTSVL 120

FYANMYTSIV FLGLISIDRY LKVVKPFGDS RMYSITFTKV LSVCVWVIMA VLSLPNIILT 180

NGQPTEDNIH DCSKLKSPLG VKWHTAVTYV NSCLFVAVLV ILIGCYIAIS RYIHKSSRQF 240

ISQSSRKRKH NQSIRVVVAV FFTCFLPYHL CRIPFTFSHL DRLLDESAQK ILYYCKEITL 300
FLSACNVCLD PIIYFFMCRS FSRRLFKKSN IRTRSESIRS LQSVRRSEVR IYYDYTDV



Seq ID NO: 566 DNA sequence

Nucleic Acid Accession #: NM_005365.1

Coding sequence: 1..948

1 11 21 31 41 51

| | | | | |

ATGTCTCTCG AGCAGAGGAG TCCGCACTGC AAGCXTTGATG AAGACCTTGA AGCCCAAGGA 60

GAG6ACTTGG GCCTGATGGG TGCACAGGAA CCCACAGGGG AGGAGGAGGA GACTACCTCC 120

TCCTCTGACA OCAAGGAGGA GGAGGTGTCT GCTGCTGGGT CATCAAGTCC TCCCCAGAGT 180

CCTCAGGGAG GOGCTTCCTC CTCCATTTCC GTCTACTACA CTTTATGGAG CCAATTCGAT 240

GA6GGCTCCA GCAGTCAAGA AGAGGAAGAG CCAAGCTCCT GGGTOGACCC AGCTCAGCT6 300

GAGTTCATQT TCCAAGAAGC ACTGAAATTG AAGGTGGCTG AGTPGGTTCA TTTCCTGCrC 360

OVCAAATATC GAGTOVAGGA GCOGGTCACA AACGCAGAM TGCtt36ACAC OSTCATCAAA 420

AMTACAAGC GCTACTTTCC TeiGAICTTC GGCftAAGCCT OOGRGTICAT GCAGGTGATC 480

TTIG6CACTG ATGTSMGGA OGTGGACOOC COCGG O»CT CCTACATCCT TGTCACTGCT 540

CTIG O OCICT OGTGCGATAG CATGCTGGCT GATGGTCATA GCRTGCCCaA GGGOGCCCTC 600

CIGATCATTG TCCTGGGTGT GATCCTMOC AAAGACAACT GCGCCCCTGft AGAGGTTATC 660

TGGGAAGOST TQAGTGTGAT GGGGGTGTAT GTTGGGAAGG AGCACATGTT CTACGGGGAG 720

OCCAGQAAGC TGCICACOCA ACATTGGOTO CRGGAAAACT ACCIGGAOTA COGGCAGGTG 780

OCOSGCAeiG ATOCTGOSCA CTACGACITC CTGTCGGGTT CCAAGGCCCA GSCTGAAACC 840

nSCTATGASA AOGICATAAA tTATTTGGTC ATGCTCAATG CAAGAGAGCC CATCTGCXAC 900
CCATOCCTTT ATCAAGAGGT TTTGGQAGAG GAGCAAGAGG GAGTCTGA

Seq ID NO: 567 Protein sequence
Protein Accession #: NP_005356.1

1 11 21 31 41 51

| | | | | |

KSliBQRSPHC KPDEDLEAQG EDLGLMGAQE PTGEEEETTS SSDSKEEEVS AAGSSSPPQS 60

PQGGASSSIS VYYTXjWSQPD EGSSSQEEEE PSSSVDPAQL EFMEXJEALKL KVAELVHFLL 120

HKYRVKEPVT KAEMLSSVIK NYKRYFPVIF GKASEFMQVI FGTDVKEVDP AGHSYILVTA 180

LGLSCDSMLG DGHSMPKAAL LIIVLGVIXjT KDNCAPEEVI WEALSVMGVY VGKERMFYGE 240

PRXLLTQDWV QENYLBYRQV P6SDPAHYEF LKGSKAHAET SYBKVINYLV MLHASEPICY 300
PSLYEBVL6E EQB6V

Seq ID NO: 568 DNA sequence
Nucleic Acid Accession #: NM_014400
Coding sequence: 86..1126



1 11 21 31 41 51

| | | | | |

G6TTACTCAT CCTGGGCTCA GGTAAGAGGG CCC6AGCTC6 6AG6CGGCAC ACCCAGGGGG 60

GACGCCAAGG GAGCAOGACG GAGCCATGGA CCCCGCCAGG AAAGCAGGTG CCCAGGCCAT 120

GATCTGGACT GCAGGCTGGC TGCTGCTGCT GCTGCTTCGC GGAGGAGCGC AGGCCCTGGA 180

GTGCTACAGC TGCGTGCAGA AAGCAGATGA 0G6ATGCTCC CC6AACAAGA TGAAGACAGT 240

GAAGTGOGGG CGGGGGGTGG AOGTCTGCAC OGAGGCCGTG GGGGCGGTGG AGACCATCCA 300

CGGACAATTC TG6CTGGGAG TGCSGGGTTG CGGTTCGGGA CTCCCCGGCA AGAATGACCG 360

CGGCCTGGAT CTTCAOGGGC TTCTGGOGTT CATCCAGCTG CAGCAATGCG CTCAGGATOG 420

CTGCAACGCC AAGCTCAACC TCACCTCGCG GGCGCTCGAC OCGGCAGGTA ATGAGAGTGC 480

ATACCCXKTCC AACGGCGTGG AGTGCTACAG CTGTGTGGGC CTGAGCC6GG AGGCGTGCCA 540

GGGTACATCG CCGCXX3GTCG TGAGCTGCTA CAACGCCAGC GATCATGTCT ACAAGGGCTG 600

CTTCGACGGC AACXTTCACCT TGACGGCAGC TAATGTGACT GTGTCCTTGC CTGTCCGGGG 660

CTGTGTCCAG GATGAATTCT GCACTCGGGA TGGAGTAACA GGCCCAGGGT TCACGCTCAG 720

TGGCTCCTGT TGCCAGGGGT CCCGCTGTAA CTCTGACCTC GGGAACAAGA CCTACTTCTC 780

CCCTCGAATC CCACCCCTTG TCOGGCTGCC CCCTCCAGA6 CCCACGACTG TGGCCTCAAC 840

CACATCTGTC ACCACTTCTA CCTCGGCCCC AGTGAGACCC ACATCCACCA CCAAACCCAT 900

GCCAGCGCCA ACCAGTCAGA CTCCGAGACA GGGAGTAGAA CACGAGGCCT CCCGGGATGA 960

GGAGCCCAGG TTGACTGGAG GCGCOGCTGG CCACCAGGAC CGCAGCAATT CAGGGCAGTA 1020

TCCTGCAAAA GGGOGGCGCC AGCAGCCCCA TAATAAAGGC TGTGTGGCTC CCACAGCTGG 1080

ATTGGCAGCC CTTCTGTT6G CCGTGGCTGC TGGTGTCCTA CTGTGAGCTT CTCCACCTGG 1140

AAATTTCCCT CTCACCTACT TCTCTGGCCC TGGGTACCCC TCTTCTCATC ACTTCCTGTT 1200

CCCACCACTG GACTGGGCTG GCCCAGCCCC TGTTTTTCCA ACATTCCCCA GTATCCCCAG 1260

GGGTGTTCTA GCTGGTTTGC GGCTTTGGGA AATAAAATAC OC3TTGTATAT ATTCTGGCAG 1320

TCCTCTTGTG GCTTTTTGAG GACAGCTCCT GTATCCTTCT CATCCTTGTC TCTCCGCTTG 1380

AGGATGCTAA ATGTTAGGAC AGAGTGAGAG AAGTCAGCTG TCACGGGGAA GGTGAGAGAG 1440

GGTGGGACAA GCTTCCTACT CACTTTCTCC TAGCCAGCCT GGACTTTGGA GCGTGGGGTG 1500

ATOGGTTCCC TGGCTCCCCA CTCTAAGCAC TGCCTCCCCT ACTCCCCGCA TCTTTGGGGA 1560

CTTATGTCTG CATATGTCTT CCTTACTAGA CTGTGAGCTC CT06AGGGCA GG6ACCGTGC 1620

TTGTATAGTG TGTGTGATCA GTTTCTGGCA CATAAAT6CC TCAATAAAGA TTTAATTACT 1680

Seq ID NO: 569 Protein sequence
Protein Accession #: NP_055215

1 11 21 31 41 51

| | | | | |

MDPARKAGAQ AMIWTAGWLL LLLLRGGAQA LECYSCVQKA DDGCSPNKMK TVKCAPGVDV 60

CTEAVGAVET IHGQFSLAVX GOGSGLPGKN ORGLDIiHGLL AFIQLQQCAQ DRCNAKLNLT 120

SRALDPAGKE SAYPPNGVEC YSCVGLSREA CQGTSPPWS CYNASDHVYK GCFOQJVTLT 180

AANVTVSIiPV RGCVQOEFCT ROGVTGFGFT LSGSCOQGSR CNSDXANRTY FSPRIPPLVR 240

LPPPEPTTVA STTSVTTSTS APVRPTSTTK PMFAPTSQTP RQGVEHBASR DEEPRLTGGA 300
AGHQDRSNSG QYPAKGGFQQ PHNKGCVAPT AGXiAALLLAV AAGVLL

Seq ID NO: 570 DNA sequence
Nucleic Acid Accession #: NM_005329.1
Coding sequence: 1..1662

1 11 21 31 41 51

| | | | | |

ATGCOGGTGC AGCTGACGAC AGCCCTGCGT GTGGTGGGCA CCAGCCTGTT TGCCCTGGCA 60

GTGCTGGGTG GCATCCTGGC AGCCTATGTG ACGGGCTACC AGTTCATCCA CACGGAAAAG 120

CACTACCTGT CCTTCGGCCT GTACGGCGCC ATCCTGGGCC TGCACCTGCT CATTCAGAGC 180

CTTTTTGCCT TCCTGGAGCA GCGGCGCATG GGACGTGCOG GGCAGGCCCT GAAGCTGCCC 240

TCCCCGGGGC GGGGCT06GT GGCACT6TGC ATTGCOGOGT ACCAGGAGGA CCCTGACTAC 300

TTGCGCAAGT GCCTGCGCTC GGCCCAGCGC ATCTCCTTCC CTGACCTCAA GGTGGTCATG 360

GTGGTGGATG GCAACOGCCA GGAGGACGCC TACATGCTGG ACATCTTCCA C6AGGTGCTG 420

GGCGGCACCG AGCAGGCOGG CTTCTTTGTG TGGCGCAGCA ACTTCCATCA GGCAGGCGAG 480

GGTG AGAO GG AGGCCAGCCT GCAGGAGGGC ATGGACOGTG TG0GG6ATGT GGTGOGGGCC 540

AGCACCTTCT CGTGCKTCAT GCAGAAGTGG GGAGGCAAGC GGCAG6TCAT GTACAOGGCC 600

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TTCAAGGCCC TOGGOtSATTC GGTGGACTAC ATCCAGGTGT GOGACTCTGA CACTGTCCTG 660 

GATCCAGCCT GCACCATCGA GATGCTTCGA GTCCTGGAGG AGGATCCCCA AGTRGGGGGA 720 

GTOGGGGGA.G ATGTCCAKVT CCTCAACAAG TAOGACTCAT GGATTTCCTT CCTGAGCAGC 780 

GTGC3GGTACT GGATGGCCTT CAACGTGGAG CGGGCCTGCC AGTCCTACTT TGGCTGTGTG 840 

CAGTGTATTA GTGGGCCCTT GGGCATGTAC OGCAACAGCC TCCTCCAGCA GTTCCTGGAG 900 

GACTGCSTACC ATpUSAAGTT CCTAGGCAOC AAGTGCACCT TGGGGGATGA CCGGCAOCTC 960 

ACCAACCGAG TCCTGAGCCT TGGCTACOGA ACTAACTATA CCOCGCGCTC CAAGTGCCTC 1020 

ACAGAGACCC CCACTAAGTA CXTCCGGTGG CTCAACCAGC AAACCCGCTG GAGCAAGTCT 1080 

TACTTCCGGG AGTGGCTCTA CAACTCTCTG TGGTTCCATA AGCACCACCT CTGGATCACC 1140 

TAOGAGTCAG TCGTCAOGGG TTTCTTOCCC TTCTTCCTCA TTGOCACGGT TAT ACAGC TT 1200 

TTCTAGOGGG GCCGCATCTG GAACATTCTC CTCTTCCTGC TGAaSGTGCA GCTGGTGGGC 1260 

ATTATCAAGG CCACCTACGC CTGCTTCCTT OQGGGCAATG CAGAGATGAT CTTCATGTCC 1320 

CTCTACTCCC TCCTCTATAT GTCCAGCCTT CTGCCGGCCA AGATCTTTGC CATTGCTACC 13 BO 

ATCAACAAAT CTGGCTGGGG CACCTCTGGC CJGAAAAACCA TTGTGGTGAA CTTCATTGGC 1440 

CTCATTCCTC TGTCCATCTG GGTGGCAGTT CTCCTG GGAG GGCTGGCCTA CACAGCTTAT 1500 

7G0CAGGACC TGTTCAGTGA GACAGAGCTA GCCTTCCTTG TCTCTGGGGC TATACTGTAT 1560 

GGCTGCTACT GGGTGGCCCT CCTCATGCTA TATCTGGCCA TCATCGCCOG GCGATGTGGG 1620 
AAGAAGCCGG AGCAGTACAG CTTGGCTTTT GCTGAGGTGT GA 

Seq ID NO: 571 Protein sequence 
Protein Accession #: NP_005320.1

1 11 21 31 41 51 

I 1 1 ! I i 

HPVQLTTALR WGTSLFALA VLGGILAAYV TGYQFIHTEK HYLSFGLYGA ILGLBLLIQS 60 

LFAFLEHRRM RRAGQALiCLP SPRRGSVALC lAAYQEDPOY LRKCLRSAQR ISPPDUCWM 120 

WOGNRQEDA YMLDIFHEVL GGTEQAGFFV WRSNFHEAGE GETEASLQEG MDRVRDWRA 180 

STFSCIMQiCH GGKREVMYTA FKAbGDSVDY IQVCDSDTVL DPACTIEMLR VLEEDPQVGG 240 

VGGDVQILNK YDSWISFLSS VRYWMAFNVE RACQSYFGCV QCISGPLGMY RNSLLQQPLE 300 

DWYHQKFLGS KCSPGDDRHL TKRVLSIX3YR TKYTARSKCL TETPTKYLRW LNQQTRWSKS 360 

YFREWLYNSIi WFHKHHLWMT YESWTGFFP E7LIATVIQL PYRGRIWNIL tPLLTVQLVG 420 

IIKATYACFL RGMAEMIFMS LYSLLYKSSL LPAKIFAIAT ZNKSGWGTSG RKTIWNFIG 480 

IjIPVSIWVAV LLGGLAYTAY CQDLFSETEL APIiVSGAILY GCYHVALLML YLAIIARROG 540 
KKPEQYSLAF AEV 

Seq ID NO: 572 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148..7095

1 11 21 31 41 51 

I I I I I I 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CX3GGGAGGGG CCGCAGACXX3 TCTGGAAATG OGAATCCTAA AGOGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCOG CCTGGATTQG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CrrGTTGAAG A6ATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TCACTACCGT 480 

GTCAGC3G6AG GAGTTTCAGA AATGGTGTTT AAA6CAAGCA AGATAACTTT TCACTGGGGA 540 

AAAltSCAATA TGTCATCTGA TGGATC»GAG CaTAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA A6CAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG aSATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GT T TTT T GTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA C3KAAACAATT TTOGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTOGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AACAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGOGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA ISOO 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

AOGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGOCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1600 

AACTTGTCG6 GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGAtGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAG6 ATGCTTCTAT GGAGGGAAAT 2100 

CTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCC6 ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCXACGGTC AACGTGGTAT ACTCX5CAGAC AACCCAACCG 2400 

GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 

ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520 

TOGGCCTTGC AT6CTA0GCC TGTATTTCCC AGTGTOGATG TGTCATTTGA ATCCATCCTG 2580 

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 

TTTCGCCATC IGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACOGAGAGT 2700 

GATAAGGTGC 0CTTGCAT6C TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTA6AGCCC 2760 
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AGCCTPGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820 

TTTGGTAGTG AATCTGGTGT TCTTTATAAA AOGCTTATGT TTTCTCAAGT TGAACCAOOC 2880 

AGCAGTGATG CCATCATGCA TGCAO G TTCT TCfiCSSOCTG AACCITCrrA TGCCTTCTCT 2940 

GATAATGAGG GCTOCCAACA CATCTTCACT GTTTCTTACA GTTCSGCAAT ACCTGTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCSUSGGTTCC TTATTTAGCG GCCCTAGCCA TATACCAATA 3060 

CCTAAGTCTT OG 'I TA ATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120 

GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180 

GGGCTXIACAG CCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GT6ATGATAA TAAGGOSCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAC CACAGTCATG 3360 

CCCAACATGT ATGATAAIGT AAATAACTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420 

ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 

GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 

TCrCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC A GAGC CAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

ACCTCAGCrr CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAGGC TTCTGATQTT 3720 

GACACCTTGC TTAAAACTCT TCTTCCAGCr GTGCCCAGTG ATCCAATATT GGTT6AAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 

AGTGAAAACA TGCTGCACTC TACATCTGTA CICAGTTTTTG ATGTGTCGCC TACTTCTCAT 3900 

ATGCACrCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAG TGAGAA ATATGAACXA 3960 

GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020 

TTXyrrCCAAA CGGOCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 

TTTGCTACAC CTGTTrTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCCGATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTTGCXGGT 4200 

ATTCCAACAG TTCCTTCTGA TACATTTGTA TCTACTGATC ATTCTCrrCC TATAGGAAAT 4260 

GGGCATGTTG CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380 

TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATGATG GTGATGATGA T6ATGATGAC 4440 

A6AGGTAGTG ATGGCTTATC CATTCATAAG TGTATGTCAT GCTCATCCTA TAGAGAATCA 4500 

CAGGAAAAGG TAAT6AATGA TTCAGACACC CACGAAAACA GTCTTATG6A TCAGAATAAT 4560 

CCAATCTCAT ACTCACTATC TGAGAATTCT GAAGAAGATA ATAGAGTCAC AAGTGTATCC 4620 

TCAGACAGTC AAACTGGTAT GGACAGAAGT CCTGGTAAAT CACCATCAGC AAATGGGCTA 4680 

TCCCAAAAGC ACAATGATGG AAAAGAGGAA AATGACATTC AGACTGGTAG TGCTCTGCTT 4740 

CCTCTCAGCC CTGAATCTAA AGCATGGGCA GTTCTGACAA GTGATGAAGA AAGTGGATCA 4800 

GGGCAAGGTA CCTCAGATAG CCTTAATGAG AATGAGACTT CCACAGATTT CAGTTTTGCA 4860 

GACACTAATG AAAAAGATGC TGATGC3GATC CTGGCAGCAG GPGACTCAGA AATAAC TCCT 4920 

GC3ATTCCCAC AGTCCCCAAC ATCATCTGTT ACTAGCGAGA ACTCAGAAGT GTrOCAOSTT 4980 

TCAGAGGCAG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 5040 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 5100 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TG CACACTT T 5160 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220 

ATTTCAGATG ATGTCQGAGC AATTCCAATA AAGCACTTTC CAAAG CATG T TGCA6ATTTA 5280 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 5340 

CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAGACAGCT CCAACCACCC AGACAACAAG 5400 

CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 5460 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATPTC 5580 

TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTOGTGGAG 5640 

AAAGGAAGGA GAAAATOTGA TCAGTACTCG CCTGCOGATG GGAGTGAGGA GT AOSGG AAC 5700 

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTGTGAG G AATTTTA CT 5760 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACGTGTGGTC 5820 

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 5880 

CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTCX3TC 5940 

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACA6 T ATGTT GCAG 6000 

CAQATTCAAC AOGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT Ca3T TCAC AA 6060 

AGAAATTATT TGGTACAAAC T6AGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 6120 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 6180 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GCTCCTGAGC 6240 

CAGTCAAATA TACAGCAGAG TGACTATTCT GCAGCCCTAA AGCAATGCAA CAGGGAAAAG 6300 

AATCGAACTT CTTCTATCAT CCCTGTGGAA AGATCAAGGG TTGGCATTTC ATCCCTGAGT 6360 

GGAGAAGGCA CAGACTACAT CAATGCCTCC TATATCATGG GCTATTACCA GAGCAATGAA 6420 

TTCATCATTA CCCAGCACCC TCTCCTTCAT ACX^^TCAAGG ATTTCTGGAG GATGATATGG 6480 

GACCATAATG CCCAACTGGT GGTTATGATT CCTGATGGCC AAAACATGGC AGAAGATGAA 6S40 

TTTGTrTACT GGCCAAATAA AGATGAGCCT ATAAATTGTG AGAGCTTTAA GGT CACTC TT 6600 

ATGGCTGAAG AACACAAATG TCTATCTAAT GAGGAAAAAC TTATAATTCA GGACTTTATC 6660 

TTAGAAGCTA CACAGGATGA TTATGTACTT GAAGTGAGGC ACTTTCAGTG TCCTAAATGG 6720 

CCAAATCCAG ATAGCCCCAT TAGTAAAACT TTTGAACTTA TAAGTGTTAT AAAAGAAGAA 6780 

GCTGCCAATA GGGATGGGCC TATGATTGTT CATGATGA6C ATGGAG6AGT GAOGG CAGGA 6840 

ACTTTCTGTG CTCTGACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC CGTGGATGTT 6900 

TACCAGGTAG CCAAGATGAT CAATCTGATG AGGCCAGGAG TCTTTGCTGA CATTGAGCAG 6960 

TATCAGTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA O^AGGCAGGA AGAGAATCCA 7020 

TCCACCTCTC TGGACAGTAA TGGTGCAGCA TTGCCTGATG GAAATATAGC TG AGAG CTTA 7080 

GAGTCTTTAG TTTAACACAG AAAGGGGTGG GGGGACTCAC ATCTGAGCAT TGTTTTCCTC 7140 

TTCCTAAAAT TAOGCAGGAA AATCAGTCTA GTTCTGTTAT CTGTTGATTT CCCATCACCT 7200 

GACAGTAACT TTCATGACAT AGGATTCTGC CGCCAAATTT ATATCATTAA CA ATGTG TGC 7260 

CTTTTTGCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATGATTG AATTTTACAG 7320 

TATTTCTAAG AATGGAATTG TGGTATTTTT TTCTGTATTG ATTTTAACAG AAAA TTTCA A 7380 

TTTATAGAGG TTAGGAATTC CAAACTACAG AAAATGTTTG TTTTTAGTGT CAAATTTTTA 7440 

GCTGTATTTG TAGCAATTAT CAGGTTTGCT ACAAATATAA CTTTTAATAC AGTAGCCTGT 7500 

AAATAAAACA CTCTTCCATA TGATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC TGTTACTTAT TGTAAATACT GCCCTAGTGT CTCC ATGGAC CAAATTTATA 7620 

TTTATAATTG TAGATTTTTA TATTTTACTA CTGAGTCAAG TTTTCTOGTT CTGTGTAATT 7680 

GTTTAGTTTA ATGACGTAGT TCATTAGCTG GTCTTACTCT ACCAGTTTTC TGACATTGTA 7740 

TTGTGTTACX: TAAGTCATTA ACTTTGTTTC AGCATGTAAT TTTAACTTTT GTGGAAAATA 7800 

GAAATACCTT CATTTTGAAA GAAGTTTTTA TGAGAATAAC ACCTTACCAA ACATTGTTCA 7860 

AATGGTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTGCCATTA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA

Seq ID NO: 573 Protein sequence
Protein Accession #: Eos sequence



1 11 21 31 41 51 

| | | | | | 

MRILKRFLAC IQLUVCRIjD WAKGYYRQQR KLVBEIGWSY TGAIAIQKNWG KKYPTCNSPK 60 
QSPINIDEDL TQfVNVKIiKKL KFQGNDKTSL ENTPIHNTGK TVBIULTSDY RVSGGVSEKV 120 
FKASKITFHW GKOiKSSDGS EHSLEGQKFP LEKQIYCFDA DRFSSPBKAV KGKGKLRALS 180 
ILFBVGTEa? UJPKAIIDGV ESVSRFOKQA ALDPPILLSL LPMSTOKYYI YNGSLTSPPC 240 
TDTVDWIVFK DTVSISESQL AVFCBVLTOO QSGYVMIiKDy LQNNFRBQQY KFSRQVFSSY 300 
TGKCBIHEAV CSSEPENVQA DPENYTSLLV THERPRWYD TMIEKFAVLY QQLDGEDQTK 360 
HBPLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPEIJ3LFPE 420 
liIGTEEIIKE EBEXaCDIBEG AIVNPGRDSA TOQIRiOCEPQ ISTTTHYNRI GTKXNEAKTN 480 
RSPTRGSBPS GKGDVPWrSr* NSTSQPVTKL ATEKDISLTS QTVTEU»PHT VEGTSASLHD 540 
GSKTVLRSPa HNLSGTAESL NTVSZTEYEE BSUjTSFICU) TGAEDSSGSS PATSAIPFZS 600 
EZnSQGYIFS SEIPETITYD VLIPES ARNA SEDSTSSGSE ESLKDPSMBG NVHFPSSTDI 660 
TAQPDVGSGR ESPLOTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLra PHYSTFAYFP 720 
TBVTPHAFTP SSRQQDIjVST VNWYSQTTQ PVYNGETPLO PSYSSEVPPL VTPLLLDNQI 780 
liNTTPAASSS DSALHATPVF PSVDVSFESI LSSYDGAPLL PFSSASFSSE LFRHLHTVSQ 840 
ZLPQVTSATE SDKVPLHASIi PVAGGDLLLE PSLAQYSDVL STTHAASBTL EFGSESGVLY 900 
KTLMPSQVEP PSSZ3AMMKAR SS6PEPSYAL SDNBGSQHIF TVSYSSAIPV HDSVGVTYQG 960 
SLFSGPSHIP IPKSSIjITPT ASLLQPTKAI* SGDGEWSGA5 SDSEFLLPDT DGLTALKISS 1020 
PVSVABFTYT TSVFCTDNKA LSKSEIIYOI ETELQIPSFN EMVYPSESTV MPNMYDIIVNK 1080 
LNASLQETSV SISSTKGMFP GSLAHTTTKV FDHEISQVPE NNPSVQPTHT VSQASGDTSL 1140 
KPVLSANSEP ASSDPASSEM LSPSTQLLPY ETSASFSTEV LLQPSFQASD VDTLLKTVLP 1200 
AVPSDPILVE TPKVDKISST MLHDIVSNSA SSENMLHSTS VPVFDVSPTS HMHSASLQGL 1260 
TISYASEKYE PVIdiKSESSH QWPSLYSKD EliFQTANIiEI NQAHPPKGRH VFATPVLSID 1320 
EPLKTLINKL IKSDEXLTST KSSVTGKVFA GIPTVASDTF VSTDHSVPIG NGHVAITAVS 1380 
PHRDGSVTST KLLFPSKATS ELSHSAKSDA 6LVGGGEDG0 TDDOGOOODD 0R6SDGLSIK 1440 
KCMSCSSYRE SQEKVMNDSD THENSUmQK KPISYSLSEN SEEDNRVTSV SSDSQTOOR 1500 
SPGKSPSANG LSQKHNDGKE ENDIQTGSAL LPLSPESKAW AVLTSDEESG SGQCTSDSUI 1560 
ENETSTOFSF AIXmEKDADG IIAAGDSEIT PGFPQSPTSS VTSENSEVFH VSEAEASNSS 1620 
HESRIGLAE6 LBSEKKAVIP LVIVSALTFI CLWLVGILI YWRKCFQTAH PYLBDSTSPR 1680 
VISTPPTPIP PISDDVGAIP IKHFPKHV3U) LHASSGFTEE FBTLKEFYQE VQSCTVDLGI 1740 
TADSSNHPDN KHKKRYINIV AYDHSRVKLA QLAEKDGKLT DYZHANYVDG YKRPKAYIAA 1800 
Q6PLKSTAED FWRMIWEHNV EVIVMITNLV EKGRRKCOQY WPADGSEEYG NFLVTQKSVQ 1860 
VLAYYTVRNF TLRNTKIKKG SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY 1920 
AKRHAVGPW VHCSAGVGRT GTYIVLDSML QQIQHBGTVN IPGFLKHIRS QRNYLVQTEE 1980 
QYVFIHDTLV EAILSKETEV LDSHIHAYVN ALLIPGPAGK TKLEKQFQLL SQSNIQQSDY 2040 
SAALKQCNRE KNRTSSIIPV ERSRVGISSL SGEGTDYINA SYIMGYYQSN EFIITQHPLL 2100 
HTIKDFWRMI HOBNAQLWM IPOGQNMAED EFVYWPNXDB PINCESPKVT LMAEEHKCLS 2160 
MBEKLIIQDF XLEATQDDYV LEVRHFQCPR HPNPDSPISK TFELISVIKB EAANRDGFMI 2220 
VHDEHGGVTA GTFCALTTLM HQLBKENSVD VYQVAia4IUL MRPGVPADIB QYQFLYKVZL 2280 
SLVSTRQEEN PSTSLDSNGA ALPDCaiZAES hEShW 



Seq ID NO: 574 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 148-4518 

1 11 21 31 41 51 

| | | | | | 

CACACATAOG CACGCACGAT CTCACTTGGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG C0GCAGAC06 TCXGGAAATG OGAATCCTAA AGOGTTTCCT OGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCOG CCTGGATTGG GCTAATGGAT ACTACA6ACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACa. GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTXK3AAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGOGGAG GAGTTTCAGA AATQGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGOGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTT6 720 

GATTTCAAAG OGATTATTGA TGGAGTOGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCXaT TTATGATAOC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTOGACAT GCCTACTGAT 1380 

AATCCTGAAC rTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACXyUiATCA GGAAAAAGGA ACCCXZAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACGGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCX: 1980 

GAAAACCCAG A6ACAATAAC ATATGATGTC CTTATAiOCAG AATCT6CTA6 AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAG6 TTCAGAA6AA TCACTAAAGG ATOCTTCTAT OGAGGGAAAT 2100 
GTOTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCaj ATGTTGGATC AGGCACAGAG 2160 

A6CTTTCTCC AGACTAATTA CACTGAGATA CGTg WG ATO AATCTGAGAA GACAAOaAG 2220 

TCCTTTTCTG CAOGGCCAGT GAIGTCACAG GG T OXrrCAG TTACftGATCT G6AAATGCCA 2280 

CATTATTCTA CCTTPGCCTA CTTCCX^ACT 6AGGTAACAC CTCATGCTTT TACCOCATOC 2340 

TCCAGACAAC AGGATTTGGT CTCXIAOGGTC AA0C5TGGTAT ACTCGCAGAC AACCCAACOG 2400 

GTATACAATG CAGAGGCCAG TAATAGTAGC CATGAGTCTC GTATTGGTCT AG CTGAG GGG 2460 

TTCGAATOOG AGAAGAAGGC AGTTATACCC CTTGTGATCG TGTCAGCCCT GACTTTTATC 2520 

TCTCraGIGG TTCrTGTGGG TATTCTCATC TACTC3GAGGA AATGCTTCCA GACTGC ACAC 2580 

TTTTACTTAG ACGACAGTAC ATCCCCTAGA GTTATATCCA CACCTCCAAC ACCTATCTTT 2640 

CCAATTTCAG ATGATGTOSG AGCAATTCCA ATAAAGCACT TTCCAAAGCA TGTTGCAGAT 2700 

TTACATGCAA GTAGTGGGTT TACTGAAGAA TTTGAGACAC TGAAAGAGTT TTACCAGGAA 2760 

GTGCAGAGCT GTACTCTTGA CTTAGGTATT ACAGCAGACA GCTCCAACCA CCCAGACAAC 2820 

AAGCACAAGA ATCGATACAT AAATATCGTT GCCTATGATC ATAGCAGGGT TAAGCTAGCA 2880 

CAGCTTCCTG AAAAGGATGG CAAACTGACT 6ATTATATCA ATGCCAATTA TGTTGATGGC 2940 

TACAACAGAC CAAAAGCTTA TATTGCTGCC CAAGGCCCAC TGAAATCCAC AGCTGAAGAT 3000 

TTCTGGAGAA TGATATGGGA ACATAATGTG GAAGTTATTG TCATGATAAC AAACCTCGTG 3060 

GAGAAAGGAA GGAGAAAATG T6ATCAGTAC TGGCCTGCCG ATGGGAGTGA GGAGT AOGGG 3120 

AACTTTCTGG TCACTCAGAA GAGTGTGCAA GTGCTTGCCT ATTATACTGT GAGGAATTTT 3180 

ACTCTAAGAA ACACAAAAAT AAAAAAGGGC TCCCAGAAAG GAAGACOCAG TGGACGTGTC 3240 

GTCACACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCCCTGCCA 3300 

GTGCrGACCT TTGTGAGAAA GGCAGCCTAT GCCAAGCGCC ATGCAGTGGG GCCTGTTGTC 3360 

GTCCACTGCA GTGCTGGAGT TGGAAGAACA GGCACATATA TTGTGCTAGA C3U3TATGTTG 3420 

CAGCAGATTC AACACGAAGG AACTGTCAAC ATATTTGGCT TCTTAAAACA CATCOGTTCA 3480 

CAAAGAAATT ATTTGGTACA AACTGAGGAG CAATATGTCT TCATTCATGA TACACTGGTT 3540 

GAGGCCATAC TTAGTAAAGA AACTGAGGTG CTGGACAGTC ATATTCATGC CTATGTTAAT 3600 

GCACTCCTCA TTCCTGGACC AGCAGGCAAA ACAAAGCTAG AGAAACAATT CCAGCTCCTG 3660 

AGOCAGTCAA ATATACAGCA GAGTGACTAT TCTGCAGCCC TAAAGCAATG CAACAGGGAA 3720 

AAGAATCGAA CTTCTTCTAT CATCCCTQTG GAAAGATCAA GGGTTGGCAT TTCATCCCTG 3780 

AGTGGAGAAG GCACAGACTA CATCAATGCC TOCTATATCA TGGGCTATTA CCAGAGCAAT 3840 

GAATTCATCA TTACCCAGCA CCCTCTCCTT CATACCATCA AGGATTTCTG GAGGATGATA 3900 

TGGGACCATA ATGCCCAACT GGTGGTTATG ATTCCTGATG GCCAAAACAT GGCAGAAGAT 3960 

GAATTTGTTT ACIGGCCAAA TAAAGATGAG CCTATAAATT GTGAGAGCTT TAAGGTCACT 4020 

CTTATGGCTG AAGAACACAA ATGTCTATCT AATGAGGAAA AACTTATAAT TCAGGACTTT 4080 

ATCTTAGAAG CTACACAGGA T6ATTATGTA CTT6AAGT6A GGCACTTTCA GTGTCCTAAA 4140 

TGGCCAAATC CAGATAGCCC CATTAGTAAA ACTTTTGAAC TTATAAGTGT TATAAAAGAA 4200 

GAAGCTGCCA ATAGGGATGG GCCTATGATT GTTCATGATG AGCATGGAGG AGTGAOGGCA 4260 

GGAACTTTCT GTGCTCTGAC AACCCTTATG CACCAACTAG AAAAAGAAAA TTCCGTGGAT 4320 

GTTTACCAGG TAGCCAAGAT GATCAATCTG ATGAGGCCAG GAGTCTTTGC TGACATTGAG 4380 

CAGTATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGTGA GCACAACGCA GGAAGAGAAT 4440 

CCATCCACCT CTCTGGACAG TAATGGTGCA GCATTGCXTTG ATGGAAATAT AG CTGAGAG C 4500 

TTAGAGTCTT TAGTTTAACA CAGAAAGGGG TGGGGGGACT CACATCTGAG CATTGTTTTC 4560 

CTCTTCCTAA AATTAGGCAG GAAAATCAGT CTAGTTCTGT TATCTGrTGA TTTOCC ATCA 4620 

CCTGACAGTA ACTTTCATGA CATAGGATTC TGCCGCCAAA TTTATATCAT TAACA ATGTG 4680 

TGCCTTTTTG CAAGACTTGT AATTTACTTA TTATGTTTGA ACTAAAATGA TTGAATTTTA 4740 

CAGTATTTCT AAGAATGGAA TTGTGGTATT TTTTTCTGrA TTGATTTTAA CAGAAAATTT 4800 

CAATTTATAG AGGTTAGGAA TTCCAAACTA CAGAAAATGT TTGTTTTTAG TGrCAAATTT 4860 

TTAGCTGTAT TTGTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAGTAGCC 4920 

TGTAAATAAA ACACTCTTCC ATATGATATT CAACATTTTA CAACTQCAGT ATTCACCTAA 4980 

AGTAGAAATA ATCTGTTACT TATTGTAAAT ACTGCCCTAG TGTCTCCATG GACCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGAGTC AAGTTTTCTA CTTCTCTtSTA 5100 

ATTGTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTGACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGGAAA 5220 

ATAGAAATAC CTTCATTTTG AAAGAAGTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

TCAAATGGTT TTTATCCAAG GAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 575 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALRQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTPIHNTGK TVEINLTNDY RVSG6VSSMV 120 

PKASKITPHW GKCNMSSDGS BRSLEGQKPP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILPEVGTEEN LOFKAIIDGV ESVSRFGKQA ALDPPILLNL LPNSTDKYYI VNGSLTSPPC 240 

TDTVDWIVFK OTVSISESQL AVFCEVIiTKQ QSGYVMLMDY LQNNPRBQQY KPSRQVPSSY 300 

TGKEBZBEAV CSSEPENVQA DPENYTSIiLV TWERFRWYD TNIBKFAVLY QQIiDGEDQTK 360 

HEPLTDGYQD LGAILNNLLP NMSYVLQIVA ICTUGLYGKY SDQIiIVDMPT DMPEU5LFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRECKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VBGTSASLND 540 

GSKTVtiRSPK MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

DIISQGYIFS SENPETITYD VLZPESARNA 5EDSTSSGSB BSLKDPSMBG KVWFPSSTOZ 660 

TAQPDVGSGR ESFLQTNYTE IRVDESBKTT KSPSAGPVMS QGPSVTDLQI PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYKAEASNS SHESRIGLAE GLESEKKAVI 780 

PLVIVSALTF ICLWLVGIL lYWRKCFQTA HFYLEDSTSP RVISTPPTPl PPISXSDVGAI 840 

PIKHFPKEVA DLHASSGFTE BPETLKEFYQ EVQSCTVDLG ITAOSSmiPD NKHKKRYIMI 900 

VAYDHSRVKL AQLAEKDGKL TDYINANYVD GYNRPKAYIA AQGPLKSTAE DFWRMIWEHN 960 

VEVrVMXTNL VEKGRRKCDQ YWPADGSEEY GNPLVTQKSV QVLAYYTVHN FTLRNTKIKK 1020 

GSQKCRPSGR WTQYHYTQW PDMGVPEYSL PVLTPVRKAA YAKRHAVGPV WHCSAGVGR 1080 

TGTYIVIiDSM LQQZQHEGTV NIFGFLKHIR SQRKYLVQTE EQYVFIHDTL VEAILSKETC 1140 

VLD5HIKAYV KAUiIFGPAG KTKLEKQFQL LSQSKtQQSD YSAALKQaiR EKHSTSSIXP 1200 

VERSRVGISS LSGEGTDYIN ASYIMGYYQS NEPIITQHPL LHTIXDFWRM IWDHNAQLW 1260 

MIPDGQNMAE DEFVYWPNKD EPINCESFKV TLMAEEHKCL SNEEKLIIQD FILEATQDDY 1320 

VLEVRHFOCP KWPNPDSPIS KTFELISVIK BEAANRDGPM IVHDEHGGVT AGTPCALTTL 1380 

MHQLEKHTSV DVYQVAKHIN LMRPGVFADI BQYQPLYKVI LSLVSTRQEB NPSTSLOSKG 1440 
AALKX3NIAE SLBSLV 
Seq ID NO: 576 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 148-4494 

1 11 21 31 41 51 

| | | | | | 

aCACATAOG C3U3SCAOSAT CTCACTZCQA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTOG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

OGGCGAGGGG C0GCAGACG6 TCTGGAAATG CGAATCCTAA AGCGTTTCCr CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACRGAGAAAA 240 

CTTGTTGAAG AG A TTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AflATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATT GGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TQTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCAGAC OGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTCTTPGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTSTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTACATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTAC ATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTOGAQAGCA ACAG TACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCXTT TCTTGTTACA 1140 

TGGGAAAGAC CTOGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAjQAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGOCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGOGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TOGCATAGGG 1560 

ACGAAATACA ATGAAGGCAA GACTAACOGA TGCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TOCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCA6 AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATOCTTCPAT GGAGGGAAAT 2100 

GTGTG6TTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCrrrCTCC AGACTAATTA CACTGAGATA CXSTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCAOGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 2460 

GAATCCGAGA AGAAGGCAGT TATAOCCCTT GTGATCGTGT CAGCCCTGAC TTTTAT CTGT 2520 

CTAGTGGTTC rTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTOCAGAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGGAAGTGC AGAGCTGTAC TGTTGACTTA 2760 

GGTATTACAG CAGACAGCTC CAACCACCCA GACAACAAGC ACAAGAATCX; ATACATAAAT 2820 

ATCGTTGCXrr ATGATCATAG CAGGGTTAAG CTAGCACA6C TTGCT6AAAA GGATGGCAAA 2880 

CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA ACAGACCAAA AGCTTATATT 2940 

GCTGCCCAAG GOCCACTGAA ATCCACAGCT CAAGATTTCT GGAGAATGAT ATGGQAACAT 3000 

AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA AAGGAAGGAG AAAATGTGAT 3060 

CAGTACTGGC CTGCOGATGG GAGTGAGGAG TACGGGAACT TTCTGGTCAC TCAGAAGAGT 3120 

GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC TAAGAAACAC AAAAATAAAA 3180 

AAGGGCTOCC AGAAAGGAAG ACCCAGTGGA CGTGTGGTCA CACAGTATCA CTACAOGCAG 3240 

TGGOCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC TGACCTTTGT GAGAAAGGCA 3300 

GCCTATGCCA AGCGCCATGC A6TGGGGCCT GTTGTCGTCC ACTGCAGTGC TGGAGTTGGA 3360 

AGAACAGGCA CATATATTGT GCTAGACAGT ATGTTGCAGC AGATTCAACA CGAAGGAACT 3420 

GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA GAAATTATTT GGTACAAACT 3480 

GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG CCATACTTAG TAAAGAAACT 3540 

GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC TCCTCATTCC TGGACCAGCA 3600 

GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGGC AGTCAAATAT ACAGCAGAGT 3660 

GACTATTCTG CAGCCCTAAA GCAATGCAAC AG6GAAAAGA ATOSAACTTC TTCTATCATC 3720 

CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG GAGAAGGCAC AGACTACATC 3780 

AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT TCATCATTAC CCAGCACCCT 3840 

CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG ACXATAATGC CCAACTGGTG 3900 

GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT TTGTTTACTG GCCAAATAAA 3960 

GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA TGGCTGAAGA ACACAAATGT 4020 

CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT TAGAAGCTAC ACAGGATGAT 4080 

TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC CAAATCCAGA TAGGCCCATT 4140 

AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG CTGCCAATAG GGAT0G6CCT 4200 

ATGATTGTTC ATGATGAGCA TQGAGGAGTG ACGGCAGGAA CTTTCTGTGC TCTGACAACC 4260 

CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT ACCAGGTAGC CAAGATGATC 4320 

AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTXSAGCAGT ATCAGTTTCT CTACAAAGTG 4380 

ATCCFCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT OCACCTCTCT GGACAGTAAT 4440 

GGTGCAGCAT T6CCTGATGG AAATATASCT GAGAGCTTAG AGTCTTTAGT TTAACACAGA 4500 

AAGGGGTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT TCCTAAAATT AGGCAGGAAA 4560 

ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG ACAGTAACTT TCATGACATA 4620 

GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC TTTTTGCAAG ACTTGTAATT 4680 

TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT ATTTCTAAGA ATGGAATT6T 4740 

GGTATTTTTT TCTGTATTGA TrTTAACAGA AAATTTCAAT TTATAGAGGT TAGGAATTCC 4800 
AAACTACAGA AAATGTTTGT TnTAGTGTC AAATTTTTAG CTGTATTTGT AGCAATTATC 4860 

AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCXTrGTA AATAAAACAC TCTTCCATAT 4920 

GATATTCAAC ATTTTACAAC TGCAGTATTC AOCTAAAGTA GAAATAATCT GTTACTTATT 4980 

GTAAATACTG COCTAGTGTC TCCATGGACC AAATTTATAT TTATAATTGT AGATTTTTAT 5040 

ATTTTACTAC T6AGTCAAGT TTTCTAGTTC TCTGTAATTG TTTAGTTTAA TGAOGTAGTT 5100 

CATTAGCTGG TCTTACTCTA CCAGmTCT GACATTGTAT TGTGTTACCT AAGTCATTAA 5160 

CrrTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAC AAATACXTTTC ATTTTGAAAG 5220 

AAGTTTTTAT GAGAATAACA CCTTAOCAAA CATTGTTCAA ATGGTTTTTA TCCAAGGAAT 5280 

TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 
AAA 

Seq ID NO: 577 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MRILKRFLAC IQLLCVGRLD HAMGYYRQQR KLVEEIGHSY TGALHQKNWG KKYPTOISPK 60 
QSPINIDEDL TQVNVNLKKL KPQGWDKTSL ENTPnCJTGK TVEINLTKDY RVSGGVSEMV 120 
FKASKITFHW GKCKMSSDGS EHSLEGQKPP LEKOIYCFDA HRPSSFEEAV KGKGKLRALS 180 
ILFEVGTEEN L0FKAIID6V ESVSRFGKQA ALOPPZLLKL LFNSTDKYYI YNGSLTSPPC 240 
TDTVDHIVFK DTVSXSESQL AVFCBVLTMQ QSGWHIMDY LQUNFREQQY KFSRQVFSSY 300 
TGKEEIHEAV CSSEPENVQA DPEKY7SLLV TWERPRWYO IMIEKPAVLY QQLOG&DQTK 360 
HEPLTDGYQD LGAILNNLU NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPBLDLFPE 420 
IiIGTESIIKE EEEGia>IBEX3 AIVNPGROSA TKQIRKKEPQ ISTTTOYNRI GTKYNEAiCrN 480 
RSPTRGSEFS GKCTVPNTSL NSTSQPVTKL ATBKDISLTS QTVTSLPPHT VEGTSASLND 540 
GSKTVLRSPH MNI^GTAESL NTVSITEYBB BSLLTSFKLD TGAEDSSGSS PATSAIPPIS 600 
ENISQGYIFS SENPETITYD VLIPESARKA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 
TAQPOVGSGR ESFLQTNYTE IRVDESEKTT KSFSA6PVMS QGPSVTDLEM PHYSTFAYFP 720 
TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYNEASNSS HESRI6LAEG LESSKKAVIP 780 
LVIVSALTPI CLWLVGII*! YHRKCFQTAH FYLEDSTSPR VISTPPTPIP PZSDDVGAIP 840 
IKHFPKHVAD LHASSGFTBE FEEVQSCTVD U3ITADSSNH PDKKHKNRYI NIVAYDHSRV 900 
KLAQLAEKDG KLTDYINANY VDGYNRPKAY lAAQGPLKST AEDPWRMIWE HNVEVIVMIT 960 

NIiVEKGRRKC DQYWPADGSE EYGMFLVTQK SVQVLAYYTV RNFTLRNTKI KKGSQRGRP5 1020 

(atWTQYHYT QWPDMGVPEY SLFVLTFVRK AAYAXRBAVG PVWHCSAGV GRTGTYZVLO 1080 

SNLQQIQBEG TVNZPGFLKH IRSQRNYIiVQ TEEQYVFJHD TLVEAILSKE TEVZiDSHZBA 1140 

YVKALLIPGP AGKTKLERQP QLLSQSNIQQ SDYSAALKQC NREKNRTSSI IPVERSRVGI 1200 

SSLSGEGTDY INASYIMGYY QSNEFIITQH PLLRTIKDFW RMIWDHNAQL WMIPDGQNM 1260 

AEDEFVYWPN KDBPINCESF fCVTLMAEEKK CLSNBBKliII QOFIIiEATQO DYVLEVRHFO 1320 

CPKHPNPDSP ISKTPELZSV IKEEAANRDG PMIVBDEHGG VTAGTFCALT TLMHQLEKEH 1380 

SVDVYQVAKM INU4RPGVFA DIEQYQPLYK VILSLVSTRQ EENPSTSLDS NGAALPDGNI 1440 
AESLESLV 



Seq ID NO: 578 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 501-4514 

1 11 21 31 41 51 

| | | | | | 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGGGGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAT TGGGGAAAGA 300 

AATATGCAAC ATGTAATAGC GCAAAACAAT CTOCTATCAA TATTGATGAA GATCTTACAC 360 

AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGG6TTG GGATAAAACA TCATTGGAAA 420 

ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCACTAAT 6ACTACCGTG 480 

TCAGCGGAGG AGTTTCAGAA ATGGTGTTTA AAGCAAGCAA GATAACTTTT CACTGGGGAA 540 

AATGCAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA AGGACAAAAA TTTCCACTTG 600 

AGATGCAAAT CTACTGCTTT GATGCGGACC GATTTTCAAG TTTTGAGGAA GCAGTCAAAG 660 

GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA GAAAATTTGG 720 

ATTTCAAAGC GATTATTGAT GGAGTOGAAA GTGTTAGTCG TTTTGGGAAG CAGGCTGCTT 780 

TAGATCCATT C3VTACTGTTG AACCTTCTGC CAAACTCAAC TGACAAGTAT TACATTTACA 840 

ATGGCTCATT GACATCTCCT CCCTGCACAG ACACAGTTGA CTGGATTGTT TTTAAAGATA 900 

CAGTTAGCAT CTCTGAAAGC CAGTTGGCTG TTTTTTGT6A AGTTCTTACA ATGCAACAAT 960 

CTGGTTATGT CATGCTGATG GACTACTTAC AAAACAATTT TOGAGAGCAA CAGTACAAGT 1020 

TCTCTAGACA GGTGTTTTCC TCATACACTG GAAAGGAAGA GATTCATGAA GCAGTTTGTA 1080 

GTTCAGAACC AGAAAATGTT CAGGCTGACC CAGAGAATTA TACCAGCCTT CTTGTTACAT 1140 

GOGAAAGACC TCGAGTCGTT TATGATACCA TGATTGAGAA GTTTGCAGTT TTGTACCAGC 1200 

AGTTGGATGG AGAGGACCAA ACCAAGCATG AATTTTTGAC AGATGGCTAT CAAGACTTGG 1260 

GTGCTATTCT CAATAATTTG CTACCCAATA TGAGTTATGT TCTTCAGATA GTAGCCATAT 1320 

GCACTAATGG CTTATATGGA AAATACAGCG ACCAACTGAT TGTCGACATG CCTACTGATA 1380 

ATCCTGAACT TGATCTTTTC CCTGAATTAA TTGGAACTGA AGAAATAATC AAGGAGGAGG 1440 

AAGAGGGAAA AGACATTGAA GAAGGCGCTA TTGTGAATCC TGGTAGAGAC AGTGCTACAA 1500 

ACCAAATCAG QAAAAAGGAA CCCCAGATTT CTACCACAAC ACACTACAAT CGCATAGGGA 1560 

CGAAATACAA TGAAGCCAAG ACTAACCGAT CCCGAACAAO AGGAAGTGAA TrCTCTGGAA 1620 

AGGGTGATGT TCCCAATACA TCTTTAAATT CCACTTCCCA ACX3«STCACT AAATTAGCCA 1680 

CAGAAAAAGA TATTTCCTTG ACTTCTCAGA CTGTGACTGA ACTGCCACCT CACACTGTGG 1740 

A AGGTA CTTC AGCCTCTTTA AATGATGGCT CTAAAACTOT TCTTAGATCT CCACATATGA 1800 

ACTTGTCGGG GACTGCAGAA TCCTTAAATA CAGTTTCTAT AACAGAATAT GAGGAGGAGA 1860 

GTTTATTGAC CAGTTTCAAG CTTGATACTG GAGCTGAAGA TTCTTCAGGC TCX»GTCCOG 1920 

CAACTTCTGC TATCCCATTC ATCTCTGAGA ACATATOCCA AGGGTATATA TTTTCCTCOS 1980 

AAAACCCAGA GAGAATAACA TATGATGTCC TTATACCAGA ATCTGCTAGA AATGCTTCCG 2040 

AAGATTCAAC TTCATCAGGT TCAGAAGAAT CACTAAAGGA TCCTTCTATG GAGGGAAATG 2100 

TGTGGTTTCC TAGCTCTACA GACATAACAG CACAGCCOGA TGTTGGATCA GGCAGAGAGA 2160 

G CTTTC TOCA GACTAATTAC ACR3AGATAC 6TGTTGATGA ATCTGAGAAG ACAACCAA6T 2220 

OCTTTTCTGC AGGGCCAGTG ATGTCACAGG GTCOCTCAGT TACAGATCTG GAAATGCCAC 2280 
ATTATTCTAC CTTTGCCTAC TTCCCAACTG AGGTAACACC TCATGCTTTT ACCCCATCCT 2340 

CXaCACAACA GGATTTGGTC TCCACGGTCA AOCTGGTATA CTOGCAGACA ACCCAAOOGG 2400 

TATACAATGA GGCCAGTAAT AGTAGCCATG AC?rCTCGTAT TGGTCTAGCT GAGGOGTTGG 2460 

AATCO^GAA GAAGGCAGTT ATACXXCTTG TGATCGTGTC AGCCCTGACT TTTAT CTGTC 2520 

TAGTGGTTCT TGTGGGTATT CTCATCTACT G6AG6AAATG CTTCCAGACT GCAMTTTT 2580 

ACTTAGAGGA CAGTACATCC CCTAGAGTXA TATCCACAOC TCCAACAiCCT ATCTTPOCSU 2640 

TTTCAGATGA TGTCX»2AGCA ATTCCAATAA AGCACTTTCC AAAGCATOTT GCAGATTTAC 2700 

ATGCAAGTAG TGGGTTTACT GAAGAATTTG AGACACTGAA AGAGTTTTAC CAGGAA6TGC 2760 

AGAGCTGTAC TGrTGACTTA GGTATTACAG CAGACAGCTC CAACCACCCA GACAACAAGC 2820 

ACAAGAATCG ATACATAAAT ATOGTTGCCT A2X3ATCATAG CAGGCTTAAG CTAGCACACC 2880 

TTGCIGAAAA GGATGGCAAA CTGACTGATT ATATCAATGC CAATTATGTT GATQGCTACA 2940 

ACAGACCAAA AGCTTATATT GCTGCCCAAG GCCCACTGAA ATCCACAGCT GA AGAT TTCT 3000 

GGAGAATGAT ATGGGAACAT AATGTGGAAG TTATTGTCAT C3ATAACAAAC CTOGTGGAGA 3060 

AAGGAAGGAG AAAATGTGAT CAGTACTGGC CTGCOGATGG GAGTGAGGAG TAOGGGAACT 3120 

TTCTGGTCAC TCAGAAGAGT GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC 3180 

TAA6AAACAC AAAAATAAAA AAGGGCTCCC AGAAAGGAAG ACCCAGTGGA CXTTGTGGTCA 3240 

CACAGTATCA CTACAGGCAG TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC 3300 

TCACCTTTGT GAGAAAGGCA GCCTATGCCA AGCGCCATGC AGTCGGGCCT GTTGTO GTCC 3360 

ACTGCAGTGC TCGAGTTGGA AGAACAGGCA CATATATTGT 6CXAGACAGT ^GTTGCAGC 3420 

AGATTCAACA CGAAGGAACT GTCAACATAT TTGGCTTCTT AAAACACATC OGTTCACAAA 3480 

6AAATTATTT GGTACAAACT GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG 3540 

CCATACTTAG TAAAGAAACT GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC 3600 

TCCrCATTCC TGGACCAGCA GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTGCT6AGCC 3660 

AGTCAAATAT ACAGCAGAGT GACTATTCXG CAGOOCTAAA GCAATGCAAC AGGGAAAAGA 3720 

ATOGAACTTC TTCTATCATC CCTGTGGAAA GATCAAGOGT TGGCATTTCA TCCCTGAGTG 3780 

GAGAAGGCAC AGACTACATC AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT 3840 

TCATCATTAC CCAGCACCCT CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG 3900 

AOCATAATGC CCAACTGGTG GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT 3960 

TTGTTTACTG GGCAAATAAA GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA 4020 

TGGCTGAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT 4080 

TAGAAGCTAC ACAGGATGAT TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC 4140 

CAAATCCAGA TAGCCCCATT AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG 4200 

CTGCCAATAG GGATGGGCCT ATGATTGTTC ATGATGAGCA TGGAG6AGTG AOSGCAGGAA 4260 

CTTTCTGTGC TCTGACAACC CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT 4320 

ACCAGGTAGC CAAGATGATC AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCAGT 4380 

ATCAGTTTCT CTACAAAGTG ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT 4440 

CCACCTCTCr GGACAGTAAT GGTGCAGCAT TGOCTGATGG AAATATAGCT GAGAGCTTAG 4500 

AGTCTTTAGT TTAACACAGA AAGGGGTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT 4560 

TCCTAAAATT AGGCAGGAAA ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG 4620 

ACAGTAACTT TCATQACATA GGATTCTGCC GCCAAATTTA TATCATTAAC A ATGT GTGCC 4680 

TTTTTGCAAG ACTTGTAATT TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT 4740 

ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT 4800 

TTATAGAGGT TAGGAATTCC AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG 4860 

CTGTATTTGT AGCAATTATC AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA 4920 

AATAAAACAC TCTTCCATAT GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA 4980 

GAAATAATCT GTTACTTATT GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT 5040 

TTATAATTGT AGATTTTTAT ATTTTACTAC TCPJSTCMiGT TTTCTAGTTC TGTGTAATTG 5100 

TTTAGTTTAA TGACGTAGTT CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT 5160 

TGTGTTACCT AAGTCATTAA CTTTGTTTCA CCATGTAATT TTAACTTTTG TGGAAAATAG 5220 

AAATAOCTTC ATTTTGAAAG AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTCyTTCAA 5280 

ATGGTTTTTA TCCAAGGAAT TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 579 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MVFKASXITP HWGKCNMSSD G5EHSLEGQK FPLEMQIYCF DADRFSSFEE AVKGKGKLRA 60 

LSILFEVGTE ENLDPKAIID GVESVSRFGK QAALDPPILL NlxLPNSTDKY YIYNGSLTSP 120 

PCTDTVDWIV FKDTVSISES QLAVFCEV;.T. HQQSGYVMLM DYLQNNFREQ QYKPSRQVFS 180 

SYTGKEEIKE AVCSSEPENV QADPENYTSL LVTWERPRW YDTMIEKFAV LYQQLDGEDQ 240 

TKHEFLTDGY QDLGAIZjNNL LFNMSYVIiQI VAICTNGLYG KYSDQLIVDM PTDNPEUILP 300 

PELZGTEGII KECEEGKDZG EGAIVNPGRD SATNQIRKKE PQISTTTRYN RIGTKYNEAK 360 

TNR5PTRGSE PSGKGDVPNT SLNSTSQPVT KLATEKDISL TSC?TVTELPP HTVECTSASL 420 

NDGSKTVLRS PHMNLSGTAE SLNTVSITBY EEESLLTSPK LDTGAEDSSG SSPATSAIPP 480 

ISENISQGYI PSSEUPETIT YDVLIPESAR NASEDSTSSG SEESLKDPSM EGKVWPPSST 540 

DITAQP0VGS GRESFLQTNY TEIRVDESEK TTKSPSAGPV MSQGPSVTDL EMPHYSTPAY 600 

PPTEVTPHAP TPSSRQQDLV STVNWYSQT TQPVYNEASN SSHESRIGLA BGLESEKKAV 660 

IPLVIVSALT FICLWLVGI LIYWRKCFQT AHFYLEDSTS PRVISTPPTP IFPISDDVGA 720 

IPIXBPPKHV ADLHAS5GFT EEPETLXEFY QEVQSCTVDL GITADSSNHP DNKHXNRYZN 780 

IVAYOKSRVX LAQLAEKDGK LTDYZNANYV OGYNRPKAYI AAQGPLKSTA EDFHRMIWEH 840 

NVEVIVMITN LVEKGRRKCD QYWPADGSEE YGNPLVTQKS VQVLAYYTVR NPTLRNTKIK 900 

XGSQKGRPSG RWTQYHYTQ WPDKGVPEYS LPVLTFVRKA AYAKRHAVGP VWHCSAGVG 960 

RTGTYIVLDS MLQQIQHEGT VNIFGFLKKI RSQRNYLVQT EEQYVFIHOT LVEAILSKST 1020 

EVLDSRIKAY VNALLIPGPA GKTKLEKQFQ LLSQSNIQQS DYSAALKQCK REKNRTSSII 1080 

PVERSRV6IS SLSGEGTDYI MASYIKGYYQ SNEFIITQHP LLETIKOFWR MIWDHNAQLV 1140 

VMIPDGQNMA EDEPVYWPNif DEPIKCESFK VTU4AEEHKC LSNEEKLIIQ DFILEATQDD 1200 

YVLEVRHFQC PKMPNPDSPI SKTFBLISVr KEEAANHDGP MIVHDEHGGV TAGTFCALTT 1260 

LMHQLEKEKS VDVYQVAKMI NLMRPGVFAD IEQYQFIjYKV ILSLVSTRQE ENPSTSLDSN 1320 
GAALPDGNIA ESLESLV 

Seq ID NO: 580 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 148-4632 

1 11 21 31 41 51 

| | | | | | 

CACACATACG CACGCACGAT CTCACTTOSA TCTATACACT GGAlGGATTAA AACAAACAAA 60 

C3UVAAAAAAC ATTTCCTTOG CTCXXTCCTCC CTCTCCACrC TGAGAAGCAG AGSAGGOGOV 120 

OGGCGAGGQG CCGCAGACCG TCTGGAAATG CGAATCCTAA AACGTTTCCT CGCTTCCATT 180 

CAGCTOCTCT CTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTOTTCAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGC5AAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCftGGCTT GGGfllAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCA CTAA TGACTACOGT 480 

GTCAGOGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATCCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGAtGCAAA TCTACTGCTT TGATGCGGAC OGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA ftOTTAAGAGC TTTATOCATT TTGTTTCAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATOA TCGAGTCGAA AGTGTTACTC GTTrrCGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTrACCA TCTCTGAAAG CCAGTTGGCT GTim iGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCIAGAC AGGTGTTTrC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTCTTACA 1140 

TGGGAAAGAC CTCGAGTCCT TTATGATACC ATGATTGAGA AGTTTGC3USr TTTGXACCAG 1200 

CAGTTCGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTAOXAAT ATGAGTTATG TTCTTCAGAT AGTAGCXZATA 1320 

TCCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCIGAAC TTCATCTTTT CCCTGAATTA ATTOGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAQGGAA AAGACATTGA AGAAGGOGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TOGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCX^CTTOCX: AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

QAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTCTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCaCTTTCAA GCTTGATACT GGAGCT6AAG ATTCTTCAGG C TCCAG TCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGA6 AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCX: 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTCGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCPG CRGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTCCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAG GGGTTG 2460 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTAT CTGT 2520 

CTAGTGGTTC TIGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTOSGAGC AATTCCAATA AAGCACTTTC CAAA GCATG T TGCAQATTTA 2700 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 2760 

CAGAGCTGTA CrGTTGACTT AGGTATTACA GCAGACAGCT CCAACX3VCCC AGACAACAAG 2820 

CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 2880 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 2940 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 3000 

TCGAGAATGA TATGGGAACA TAAIGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 3060 

AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 3120 

TTTCTCGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTGTGAG GAATTTTACT 3180 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG AOOTGTGGTC 3240 

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 3300 

CTCACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTCGTC 3360 

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 3420 

CA6ATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 3480 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 3540

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TOTTAATGCA 3600 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GGGTCTCACT 3660 

CTGTCACCCA GGCTGGAGTG CAGAGGCACA ATCTCGGCTC ACTGCAACCT TCCTCTCCCT 3720 

GGCTTAACTG ATCCTCCTAC CTCAGCCTCC CGAGTGGCTG GGACTATACT CCTGAGCCAG 3780 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG GGAAAAGAAT 3840 

OGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 3900 

GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 3960 

ATCATTACCC A6CACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 4020 

CATAATGCCC AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATGAATTT 4080 

GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 4140 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 4260 

AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 4320 

GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCAGGAACT 4380 

TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 4440 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 4500 

CAGTTTCTCT ACAAAGTGAT CXTTCAGCCTT GTGGGCACAA GGCACGAAGA GAATCCATCC 4560 

ACCTCTCCGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 4620 

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 4680 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCAOCTGAC 4740 

AGTAACTTTC ATGACATAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTGCCTT 4800 

TTTGCAAQAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 4860 

TTCXAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 4920

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 4980 

GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT ACCCTGTAAA 5040 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 5100 

AATAATCrCT TACTTATTGT AAATACXGCC CTAGTGTCPC CATGGACCftA ArTTATATTT 5160

ATAATTGTAQ ATTTTTATAT TTTACTACTG AC7PCAAGTTT TCWGTTCTG TGTAATTGTT 5220

TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280 

TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTrT AACTTTTGTG GAAAATAGAA 5340 

ATACCTTCAT TTTGJUVAGAA GTTTTTATGA GAATAACACC TTAOCAAACA TTGTTCAAAT 5400 

GGTTTTTATC CRAGGAATTG CftAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 581 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MRILKRFLAC IQLLCVCRLD KANGYYaQQR KLVEEIGMSY TGALHQXHNG KKYPTQISPK 60 

QSPIKIDEDL TQVNVNLKKL KFQGMDKTSL aiTFIHlJTGK TVEINLTNDY RV9GGVSSMV 120 

FKASKITFHW GKOJMSSDGS EHSLEGQKFP LEMQIYCFDA DSIFSSFEEAV KGKGKLRALS 180

ILFEVGTEEN LDFKAIIDGV ESVSRPGKQA AIiDPFILUIL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTKQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TCKEBIHEAV CS5EPEMVQA DPENYTSLLV TWERPRWYD TMIEKFAVLY QQLDGEDQTK 360 

HEFLTDGYQD LGAIUJNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDWPT DNPELDLPPE 420

LIGTEEIIKB EEBGICDIBEG AIVHPGSDSA TNQlRiOCSPQ ISTTTHYKRI GTKYNEAKTM 480

RSPTRGSEPS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTBIiPPHT VEXTTSASXjKD 540

GSKTVLRSPH MJILSGTAESL NTVSITBYEE ESLLTSFiCLD TGAEDSSGSS PATSAIPPIS 600 

ENISQGYIFS SENPETITYD VLIPESARKA SEDSTSSGSB ESLKDPSMBG NVWPPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDIiEM PHYSTFAYFP 720 

TEVTPHAPTP SSRQQDLVST VNWYSQTTQ PVYNBASNSS HESRIGLAEG LESEKKAVIP 780

LVIVSALTFI CLWLVGIIiZ YWRKCFQTAH FYLEDSTSPR VISTPPTPIP PISDDVGAIP 840 

IKHFPKHVAO LHASSGPTEB PETLKEFYQE VQSCTVDLGI TADSSNHPDII XHKNRYIMIV 900

AYDKSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA QGPLKSTAED FWRMIWEHNV 960 

EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVO VLAYYTVRNF TLRNTKIKKG 1020 

SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY AKRHAVGPW VHCSAGVGRT 1080

GTYIVU)SML QQIQHEGTVII IPGFLKHIRS QHNYLVQTEE QYVFIHDTLV EAILSKETBV 1140 

IjDSHIHAYVM AIiLIPGPAGlC TKLEKQPQGL TLSPRIiECRG TISAHOJLPL PGLTDPPTSA 1200 

SRVAGTILLS QSNIQQSDYS AALKQCNREK NRTSSIIPVE RSRVGISSLS GEGTDYIKAS 1260 

YIMGYYQSNE FIITOHPLLH TIKDFWRMIW DHNAQliWMI PDGQKMAH3E FVYWPNKDEP 1320 

INCBSFKVn. MAEEHKCLSN EEKLIIQDFI LEATQDDYVI* EVSHFQCPKW PNPDSPISKT 1380

FELXSVIKBE AANRDGPMIV KDSIGGVTAG TFCALTTLMH QLEKQISVDV YQVAKMINLM 1440 
RPGVFADIEQ YQFLYKVIXiS ZjVGTRQEENP STSU35M6AA LPOGMIAESL ESLV 



Seq ID NO: 582 DNA sequence

Nucleic Acid Accession #: NM_002851.1
Coding sequence: 148..7092

1 11 21 31 41 51

| | | | | |

CACACATACG CAOGCACGAT CTCACTTOGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTC3G CTCCCCCTCC CTCTCCACTC TGAGAAGCAG A6GAGC0GCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG OGAATCCTAA AGOSTTTOCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGOGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAjG CGATTATTGA TGGA6TCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CaGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTOGAGTCGT TTAIGATACC ATGATTGAGA AGTTT6CAGT TTTGTAOCAS 1200

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAOACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAOGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAAtACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TOCACTTOCC AACCAGTCRC TAAATTAGOC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACX: TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCC3CATT CATCTCTGAO AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAQ AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100

GTGTGGTTTC CTAGCTCTAC AGAC31TAACA GCACAGCXTCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTGATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCAOGGTC AAOGTGGTAT ACT06CAGAC AACCCAACCG 2400

GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TOCTCTAGTC 2460 

ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTCCTTC AAGTAGTGAT 2520 
TOGGCCTTGC ATGCTAOGCC TGTATTTCCC AGTGTCGATG TGrCAITTGA ATCCATCCTG 2580

TCTTCCTATG ATGGTGCACC TTTGCTTOCA 1 n' l ' CC ' iCll» CTTCCTTCAG TAGTGAATTG 2640 

TTTOGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCftGC TACOGAGAGT 2700 

GATAAGGTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 

AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GAOGCTGGAA 2820

TTTGGTAGTG AATCTGGTGT TCTTTATAAA AOGCTTATGT TTTCTCAilGT TGAAOCAOCC 2880 

AGCAGTGATG CCATGATGCA TGCAOGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940 

QATAATGAGG GCTCOCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTCTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTACCG GCCCTAGCCA TATACCAATA 3060 

CCTAAGTCTT OSTTAATAAC CCCAACTGCA TCATEACTGC AGCCTACTCA TGCCCTCTCT 3120

GGTGATGGGG AATGGTCTGG AGCCTCrTCT GATAGTGAAT TrCTTTTAOC TGACACAGAT 3180

GGGCTGACAG OCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACX: CTTCTGAAAG CACAGTCATG 3360 

CCCAACATGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420

ATTTCTAGCA GCAAGGGCAT GTTTCCAGGG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 

GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540

TCTCAAGC3^T CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

ACCTCAGCrr CTTTTAGTAC TGAAGTATTG CTACAACCTT OCTTTCAGGC TTCPGATGTT 3720

GACACCTTGC TTAAAACTGT TCTTCCAGCT GTGCCCAGTG ATCCAATATT GGTTGAAACC 3780

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 

AGTGAAAACA TGCTGCACTC TACATCIGTA CCAGTTTTTG ATGTGTG6CC TACTTCTCAT 3900 

ATGCACTCTG CTTCACTTCA AGGTTTX3ACC ATTTGCTATG CAAGTGAGAA ATATGAACGA 3960 

6TTTT6TTAA AAAGTGAAAG TTGCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020

TTGTTCCAAA OGGCCAATTT GGAGATTAAC CAGGCCCATC COCCAAAAGG AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCaSATG AAATTTTAAC CTCCACXJ^ AGTTCTGTTA CTGGTAACSGT ATTTGCTGG7 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 

GGGCATGTTG CCATTACA6C TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320

TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCOGGT 4380 

TTAGIGGGT6 GTGGTGAAGA TGGTGACACT GATGATGATG GTGATGATGA TGATGACAGA 4440 

GATAGTGATG GCTTATCCAT TCATAAGTGT ATGTCATGCT CATCCTATAG AGAATCACAG 4500 

GAAAAGGTAA TGAATGATTC AGACACCCAC GAAAACAGTC TTATGGATCA GAATAATCCA 4560

ATCTCATACT CACTATCTGA GAATTCTGAA GAAGATAATA GAGTCACAAG TGTATCCTCA 4620

GACAGTCAAA CTGGTATGGA CAGAAGTCCT GGTAAATCAC CATCACCAAA TGGGCTATCC 4680 

CAAAA6CACA ATGATGGAAA AGAGGAAAAT GACATTCAGA CIGGTAGTGC TCT6CTTCCT 4740 

CTCAGCCCTG AATCTAAA6C AT6GGCAGTT CTGACAAGTG ATGAA6AAAG TGGATCAGGG 4800 

CAAGGTACCT CAGATAGCCT TAATGAGAAT GAGACTTCCA CAGATTTCAG TTTTGCAGAC 4860 

ACTAATGAAA AAGATGCTGA TGGGATCCTG GCAGCAGGTG ACTCAGAAAT AACTCCTGGA 4920

TTCCCACAGT CCCCAACATC ATCTGTTACT AGCGAGAACT CAGAAGTGTT CCACGTTTCA 4980 

GAGGCAGAGG CCAGTAATAG TAGCCATGAG TCTCGTATTG GTCTAGCTGA GGGGTTGGAA 5040 

TCCGAGAAGA AGGCAGTTAT ACCCCTTGTG ATCGTGTCAG CCCTGACTTT TATCTGTCTA 5100 

GTGGTTCTT6 TGGGTATTCT CATCTACTGG AGGAAATGCT TCCAGACTGC ACACTTTTAC 5160

TTAGAGGACA GTACATCCCX: TAGAQTTATA TCCACACCTC CAACACCTAT CTTTCCAATT 5220

TCAGATGATG TCGGAGCAAT TCCAATAAAG CACTTTCCAA AGCATGTTGC AGATTTACAT 5280 

GCAAGTAGTG GGTTTACTGA AGAATTTGAG ACACTGAAAG AGTTTTACCA GGAAGTGCAG 5340 

AGCTGTACTG TTGACTTAGG TATTACAGCA GACAGCTCCA ACCACCCAGA CAACAAGCAC 5400 

AAGAATCGAT ACATAAATAT OGTTGCCTAT GATCATAGCA GGGTTAAGCT AGCACAGCTT 5460 

GCItSAAAAGG ATGGCAAACT GACIGATTAT ATCAATGCCA ATTATGTTGA TGGCTACAAC 5520

AGACCAAAAG CTTATATTGC TGCXX:AAGGC CCACTGAAAT CCACAGCTGA AGATTTCTGG 5580 

AGAATGATAT GGGAACATAA TGTGGAAGTT ATTGTCATGA TAACAAACCT CGTGGAGAAA 5640 

GGAAGGAGAA AATGTGATCA GTACTGGCCT GCCGATGGGA GTGAGGAGTA CGGGAACTTT 5700 

CTGGTCACTC AQAAGAGTGT GCAAGTGCTT GCCTATTATA CTGTGAGGAA TTTTACTCTA 5760 

AGAAACACAA AAATAAAAAA GGGCTCCCAG AAAGGAAGAC CCAGTGGACG TGTGGTCACA 5820

CAGTATCACT ACACGCAGTG GCCTGACATG GGAGTACCAG AGTACTCCCT GCCAGTGCTG 5880 

ACCTTTGTGA GAAAGGCAGC CTATGCCAAG CGCCATGCAG TGGGGCCTGT TGTCGTCCAC 5940

TGCAGTGCTG GAGTTGGAAG AACA6GCACA TATATTGTGC TAGACAGTAT GTTGCAGCAG 6000 

ATTCAACACG AAGGAACTGT CAACATATTT GGCTTCTTAA AACACATCCG TTCACAAAGA 6060 

AATTATTTGG TACAAACTGA GGAGCAATAT GTCTTCATTC ATGATACACT GGTTGAGGCC 6120

ATACTTAGTA AAGAAACTGA GGTGCTGGAC AGTCATATTC ATGCCTATGT TAATGCACTC 6180 

CTCATTCCTG GACCAGCAGG CAAAACAAAG CTAGAGAAAC AATTCCAGCT CCTGAGCCAG 6240 

TCAAATATAC A6CAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG G6AAAAGAAT 6300 

GGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 6360

GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 6420

ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 6480 

CATAATGCCC AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATGAATTT 6540 

GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 6600 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 6660 

GAAGCTACAC AGGATGATTA tGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 6720

AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 6780

GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCAT6 GAGGAGTGAC GGCAG6AACT 6840 

TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 6900 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 6960 

CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGAGCACAA GGCAGGAA6A GAATCCATCC 7020

AOCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 7080 

TCTTTAGTTT AACACAGAAA GGGGT6GGGG GACTCACATC TGAGCATTGT ■ m X X 'I'en' C 7140 

CTAAAATTAG GCAGGAAAAT CA6TCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 7200 

AGTAACTTTC ATGACATAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTGOCTT 7260 

TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 7320

TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 7380 

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 7440 

GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGOCTGTAAA 7500 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 7560

AATAATCTGT TACTTATTGT AAATACTGGC CTAGTGTCTC CATGGACCAA ATTTATATTT 7620

ATAATTGTAG ATTTTTATAT TTTACTACT6 AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 7680 

TAGTTTAATG ACGTAGTTCA TTAGCPGGTC TTACTCTACC ACTTTTCTGA CATTGTATTG 7740 
TGTTACCTAA GTCATTAACT TTGTTTCACC ATGTAATTTT AACTTTTGTG GAAAATAGAA 7800
ATACCTTOVr TTTCAAASAA GTTTTTATGA GAATAAOVCC TTACCKAAQV TTGTTCMAT 7860
eSITTTTKIC CAAOBAATTG OMAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 7920
AAAAMAAAA AAAAAAAAAA A 

Seq ID NO: 583 Protein sequence
Protein Accession #: NP_002842.1

1 11 21 31 41 51

| | | | | |

MRILXRFLAC IQLLCVCRU) HAKGYVkQQR KLVEEIGWSy TGALNQKNWG tOCYPTCMSPK 60

QSPIKIDEDL TQVNVmjKKL KFQGWDKTSL ENTFIHNTGX TVEINLTNOY RVSGGVSEMV 120

PKASKITFHW GKCKMSSDGS EaiSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGXLRALS 180

2LFEVGTE3J LD?iCAIir>CV ESVSRFGKOA ALOPFILLNL LPNSTDKYYI YKGSLTSPPC 240

TDTVDWIVFK DTVSISBSQL AVFCEVIiTKQ QSGYVMLHDY LQJOIFRBQQY KPSRQVFSSY 300

TGKEBXBEAV [missing] DPEKYTSLLV THERPRWYD TMZBKFAVLY QQLOGEDQTK 360

HEFLTOGYQD LGAILNNLLP NMSYVLQIVA ICmCtiYGKY SOQLIVDMFT [missing] 420

LIGTEEIIXB EEEGKDIEEG AIVNPGRDSA THQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480

RSPTRGSEFS GKGDVPNTSI* NSTSQPVTKL ATEKDISLTS OTVTELPPHT VEGTSASIJJD 540

GSKTVLRSPH MtlLSGTAESL NTVSITEYBE ESIiLTSFKLD TGAEOSSGSS PATSAIPFIS 600

ENISQGYIFS SBNPETITYD VLIPE5ARKA SEDSTSSGSB ESLKDPSMEG NVWFPSSTOI 660

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSPSAGPVMS QGPSVTDLEM PHYSTPAYFP 720

TBVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYNGSTPLQ PSYSSEVFPL VTPLUJ3NQZ 780

LNTTPAASSS DSALBATPVF PSVDVSPESI LSSYDGAPLIi PFSSASFSSB LPRHLHTVSQ 840

ZLPQVTSATB SDKVPZjHASL PVAGGDLLLE PSLAQYSDVL STTHAASETL EFG5ESGVLY 900

KTLMFSQVSP PSSDAMMKAR SSGPBPSYAL SONEGSQHIF TVSYSSAIPV HDSVGVTYQG 960

SLFSGPSHIP IPKSSLITPT ASLLQPTHAL SGDGEWSGAS SDSEFLLPDT DGLTALNISS 1020

PVSVABFTYT T5VFGDDNXA LSRSEIiyGN ETELQIPSFN EMVYPSBSTV MPNHYDNVNK 1080

UIASZ.QBTSV SISSTIK5MFP GSIiAHTTTKV FDKEISQVPE NNFSVQPTHT VSQASGDTSL 1140

KPVLSAKSEP A5SDPASSEM XjSPSTQLLFY BTSASFSTEV LLQPSFQASD VDTLLKTVLP 1200

AVPSDPILVE TPKVDKISST MLHLIVSNSA SSEimiiHSTS VPVFDVSPTS HMHSASIiQGL 1260

TISYASEKYE PVLLKSESSH QWPSLYSND ELFQTANLEI KQAHPPKGRH VFATPVLSID 1320

EPLNTLINKL IHSDEILTST KSSVTGKVFA GIPTVASDTF VSTDHSVPIG NGHVAITAVS 1380

PHRDGSVTST KLLFPSKATS ELSHSAKSDA GliVGGGEDGD TDDDGDDDDD ROSDGLSIKK 1440

CMSCSSYRES QEKVMNDSDT HENSLMDQNN PISYSLSENS EEDajRVTSVS SOSQTGMDRS 1500

PGKSPSANGXi [missing] NDIQTGSALL PLSPSSKAWA VLTSDEESGS GQGTSDSUIE 1560

NETSTDFSFA DTNEKDADGX lAAGOSEITP GFPQSPTSSV TSEHSEVPHV [missing] 1620

ESRIGLAEGL £S£KKAVIPL VIVSALTPIC LWLVGILIY WRKCFQTAHP YLEDSTSPRV 1680

ISTPPTPIFP ISDDVGAIPI KHFPKHVADL HASSGFTEEF ETLKEFYQEV QSCTVDLGIT 1740

ADSSNKPDNK HKNRYINIVA YDHSRVKIiAQ LAEKDGKLTD YINANYVDGY NRPKAYIAAQ 1800

GPLKSTAEDP WRMIHEHKVE VIVMITNbVE KGRRKCDQYW PADGSECYGN PLVTQKSVQV 1860

LAYYTVWTFT LRNTKIKiGGS QKGRPSGRW KJYHYTQWPD MGVPEYSLPV LTFVRKAAYA 1920

KRHAVGFVW HCSAGVGRTG TYIVUISMLQ QIQHEGTVNI PGFLKHIRSQ RNYLVQTEEQ 1980

YVFIHDTLVE AILSKETEVL DSHIHAYVNA LLIPGPAGKT KLEKQFQLIiS QSNIQQSDYS 2040

AALKQCNHEK NRTSSIIPVE RSRVGISSLS GEGTDYINAS YIKGYYQSNE FIITQHPLLH 2100

TIKDFWRMIW DHNAQLWMI PDGQNMAEDE FVYWPNKDEP INCESFKVTL MAEERKCLSN 2160

EEKLIIQDFI liEATQDDYVL EVRHFOCPKW PIJPDSPISKT FELISVIKEB MNSXXSPiflV 2220

HDEHGGVTAG TFCALTTLMH QLEKENSVDV YQVAKMINIjM RPGVPADIEQ YQFLYKVILS 2280

LVSTRQEENP STSLDSNGAA LPOGNIACSL ESLV



Seq ID NO: 584 DNA sequence

Nucleic Acid Accession #: NM_005688.1

Coding sequence: 126..4439

1 11 21 31 41 51

| | | | | |

CCGGGCAGGT GGCTCATGCT CGGGAGC6TG GTTGAGCGGC [missing] GTCCTGGAGC 60

AGGGGC3GCAG GAATTCTGAT GTGAAACTAA CAGTCTGTGA [missing] CTCOGCTCAG 120

AGAAGATGAA GGATATCGAC ATAGGAAAAG AGTATATCAT [missing] GGGTATAGAA 180

GTGTGAGGGA GAGAACXS^GC ACTTCTGGGA CGCACAGAGA [missing] TCCAAGTTCA 240

GGAGAACTCG ACOGTTGGAA TGCCAAGATG CCTTGGAAAC [missing] GCCGAGGGCC 300

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT [missing] CATGCCAAGG 360

GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG [missing] AAACACCAGC 420

ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT [missing] TCTTCTCTGG 480

CCOGTGTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA [missing] CTGTCCAAGC 540

ACGAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT [missing] GAGCTGAATG 600

AAGTTGGGCC AGACGCTGCT TCCCTGCGAA GGGTTGTGTG [missing] OGCACCAGGC 660

TCATCCTGTC CATOST6T6C CTGATGATCA 0GCA6CTGGC [missing] GGAOCAGCCT 720

TCATGGTGAA ACACCTCTTG GAGTATACCC AGGCAACA6A [missing] CAGTACAGCT 780

TGTTGTrAGT 6CTGGGCCTC CTCCTGAOGG AAATCGTGCG [missing] CTTGCACTGA 840

CTTGGGCATT GAATTACCGA ACCGGTGTCC GCTTGCGGGG GGCCATCCTA ACCATGGCAT 900

TTAAGAAGAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTGGGTGAG CTCATCAACA 960

TTTGCTCCAA CGATGGGCAG AGAATGTTTG AGGCAGCAGC CGTTGGCAGC CTGCTGGCTG 1020

GACGACCCGT tgttgccatc TTAGGCATGA TTTATAATGT AATTATTCTG GGACCAACAG 1080

GCTTCCTGGG atcagctgtt TTTATCCrCT TTTACCCAGC AATGATGTXT GCATCACGGC 1140

TCACA6CATA TTTCAGGAOA AAATGGGTGG COGCCAOGGA TGAAOGTGTC CAGAAOATGA 1200

ATGAAGTTCT tACTTACATT AAATTTATCA AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260

AGAGTGTTCA AAAAATCCGC GAGGAGGAGC GTCGGATATT GGAAAAAGCC GGGTACTTCC 1320

AGGGTATCAC TGTGGGTGTG GCTCCCATTG TGGTGGTGAT TGCCAGCGTG GTGACCTTCT 1380

CTGTTCATAT GACCCTGGGC TTCGATCTGA CAGCAGCACA GGCTTTCACA GTGGTGACAG 1440

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACACOGTT TTCAGTAAAG TOCCTCTCAG 1500

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GT'lT G rri'CT AATGGAAGAG GTTCACATGA 1560

TAAAGAACAA ACCAGCCAGT CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620

GGGACTCCTC CCACTCCAGT ATCCAGAACT CGCCCAAGCT GACCCCCAAA ATGAAAAAAG 1680

ACAAGA6GGC TTCCAGGGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGCGC ACTQAGCATC 1740

AGGOGGTGCT G6CA6AGCAG AAAGGCCACC TCCTCCTGGA CAGTGAOGAO CGGCCCAGTC 1800

CCGAAGAGGA AGAAiOCSCAAG CACATCCAOC TGGGOCaOCT GOGCTTACAG AGGACACTGC 1860

ACAGCATCGA TCTCGAGATC CAAGAGGGTA AACTGGTTGG AATCTGOSGC ACTGTGGGAA 1920

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GAOGCTTCTA GAGGGCAGCA 1980

TTGCAATCAG TGGAACCTTC GCTTATGTGG CCCACCAGGC CTGGATCCTC AATCCTACTC 2040 

TGAGAGACftA. CATOCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTCCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGCGACCTG AC5GGAGATTG 2160 

GAGAGOSAGG AGCC3UUXTG AGCGGTGGGC AGCGCCAGAG GATCAGCCTT GCCOGGGCCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG AOGAOOCCCT CAGTGCCTTA GATGCTCATG 2280 

TGGGCAACCA CATXTTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACAGrrCTGT 2340 

TTGTTACCCA CCAGTTACAG TACCTGGTTG ACPGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTATTAC GGAAAGAGGC ACCCA7GAGG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CXSVrrTTT A A TAACCT G TT G CIGGGAGAGA CAOGGCCAGT TGAGATCAAT TCAAAAAAGG 2520 

AAACCAGTGG TFCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAG6A TCAGTAAAGA 2580 

AGGAAAAAGC AGTAAAGCCA GAGGAAGGGC AGCrTGTGCA GCTGGAAGAG AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTGGCAT 2700 

TCCTGGTTAT TAtGGCCCTT TTCATGCTGA ATGTAGGCAG CACXXjOCTTC AGCACCTGGT 2760 

GGTTGAGTTA CTGGATCAAG CAAGGAAGOG QGAACACCAC TGTGACTCGA GGGAAOGAGA 2820

CCTCG6T6AG TGACAGCATG AAGGACAATC CTCATATGCA GTACTATGCC AGCATCTACG 2880 

CCCTCTCCAT GGCAGTCATG CTGATCCTGA AAGCCATTOG AGGAGTTGTC TTTGTCAAGG 2940 

GCACGCTGCG AGCTTCCTCC CGGCTCCATG AOGAGCTTTT CCGAAGGATC CTTGGAAGCC 3000 

CTATGAAGTr TTTTGACACG ACCCCCACAG QGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

TGGATGAAGT TGACGTGCGG CTGCCGTTCC AGGCOGAGAT GTTCATCCAG AAOGTTATCC 3120 

TGGTGTTCTT CTCTGTGGGA ATGATCGCAG GAGTCTTCCC GTGGTTCCrT GTGGCAGTGG 3180 

GGCCCCTTCT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTOQGGAGC 3240 

TGAAGOGTCT GGACAATATC AOGCAGTCAC CTTTCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CAOCATOCAC GCCTACAATA AAOGGCAGGA GTTTCTGCAC AGATACCAGG 3360 

AGCIGCTGGA TGACAACX3\A GCTCCTTTTT TTTTGTTTAC GTGTGOGATG CGGTGGCTGG 3420 

CTGTGCGGCT GGACCTCATC AGCATCGCCC TCATCACCAC CACGGGGCTG ATGATCGTTC 3480 

TTAIGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TAAOGGGGCT GTTCCAGTTT AOGGTCAGAC TGGCATCTGA GACAGAAGCT CGATTCACCT 3600 

OGGTGGAGAG 6ATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660 

AGAACAAGGC TCXCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT GAGAAOGCAG 3720 

AGATGAGGTA CCX5AGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATOCTTC ACGATCAAAC 3780 

CTAAAGAGAA GATTGGCATT GTGGGGCGGA CAGGATCAGG GAAGTCCTCG CTGGGGATGG 3840 

CCCTCTTCCG TCTGGTGGAG TTATCTGGAG GCTGCATCAA GATTGATGGA GTGAGAATCA 3900 

GTGATATTGG CCTTGCCGAC CTCCGAAGCA AACTCTCTAT CATTCCTCAA GAGCCGGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGACCAGA 4020 

TTTGGGATGC CCTGGAOAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCT6AAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGGGGAACGG CAGCTCTTGT 4140 

GCATAGCTAG AGCCCPGCTC CXSCCACTOTA AGATTCTGAT TTTAGATGAA GCCACAGCTG 4200 

CCATGGACAC AGAGACAGAC TTATTGATTC AAGAGACCAT CCGAGAAGCA TTTGCAGACT 4260 

GTACCATGCT 6ACCATTGCC CATCXJCCTGC ACACGGTTCT AGGCTCCXy^T AGGATTATGG 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACAOCCCATC GGTCCTTCTG TCCAACGACA 4380 

GTTCCCXyVTT CTATGCCATG TTTGCTGCT6 CAGAGAACAA GGTCGCTGTC AAGGGCTGAC 4440 

TCCrCCCTGT TGACGAAGTC TCTTTTCTTT AGAGCATTGC CATTCCCTGC CTGGG60CSGG 4500

CCCCTCATCG CGTCCTCCTA CCGAAACCTT GCCTTTCTCG ATTTTATCTT TOGCACAGCA 4560 

GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TGTAGCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGTACAGT 4860 

TtGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCrT 4920

CTCTAGCTGG TGGTTTCACG GTGCCAGGTT TTCTGGGTGT CCAAAGGAAG ACGTGTGGCA 4980 

ATAGTGGGCC CTCCGACAGC CCCCTCTGCC GCCTCCCCAC AGCCGCTCCA GGGGTGGCTG 5040 

GAGA0GGGT6 GGCGGCTGGA GACCATGCAG AGCGCCXTTGA GTTCTCAGGG CTCCTGCCTT 5100 

CTGTCXrrGGT GTCACTTACT GTTTCT6TCA GGAGAGCAGC GGGGGGAAGC CCAGGCCCCT 5160

TTTCACTCCC TCCATCAAGA ATGGG6ATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 

TTTCCrcCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACA6AGAG 5280 

TCCCACTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAACAOCT 5340 

GTTGGTTCCA AGCCCTGGAG CCAACTGCTG CTTTTTGAGG TGGCACTTTT TCATTTGCCT 5400 

ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGATT TCGTGGGTCT GTTTTCCTTT 5460 

CTCACOSCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520 

CAGCTCTTGC TAATCAGTGT CTCACACTGG OGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580 

AOCTCAGGTT GCrGGTTGCT GT6TGGTTTG GXGTGTTCCC GCAAACCOCC TTTGTGCTGT 5640

GGGGCTGGTA GCTCAGGTGG GCGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGGGTTGC 5700 

ATGTCGTGAC CAACTAGACA TTCTGTOGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAG 5760 

CAAAAATCTG AAAATGTGAA TAAAATTATT TrCGATTTTC TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 585 Protein sequence
Protein Accession #: NP_005679.1

1 11 21 31 41 51

| | | | | |

MKDIOIGKEY IIPSPGiTRSV RERTSTSGTH RDREDSKFRR TRPLECQDAL BTAARAEGLS 60 

U>ASMHSQLR IU3EEUPKGK YHHGLSAUCP IRTTSKHQHP VX31IAGLFSCM TFSWLSSXaAR 120 

VAHK8GELSM EDVWSLSK8E SSOVNCRRIiE RUHQEELMEV GPDAASLRRV VWZFCBTRLI 180 

LSIVCLMITQ LAGPSGPAFM VKKLLEyTQA TESHLQYSLIi LVLGLLLTEI VRSWSLALTW 240 

AIJJYRTGVRL RGAILTMAFK KILFCLKNIKE KSIiGELINIC SNDGQRMFEA AAVGSUAGG 300 

PWAILGMIY NVIILGPTGP LGSAVFILFY PAMMFASRLT AYFRRKCVAA TDERVQKMNE 360 

VLTYIKFIKM YAWVKAPSQS VQKIREEERR ILEKAGYFQG ITVGVAPIW VIASWTPSV 420 

HMTLGPDLTA AQAPTWTVP NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEEVHMIK 480 

MKPASPHIKI EMKNATLAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEICV RQWJRTEHQA 540 

VLAEQKGKLL LDSDERPSPE EEEGKHIHLG HLRLQRTLHS ZDLEIQEGKL VGICGSVGSG 600 

KTSLISAILG QMTLLEGSIA ISGTPAYVAQ QAHILNATLR DMILPGKEYD EBRYNSVUTS 660 

CCLRPDLAIL PSSDLTEIGE RGANLSGGQR QRISLARALY SDRSIYIU3D PLSAX^IAHVG 720 

NHIFNSAIRK HLKSKTVLFV THQLQYLVDC DEVIFMKBGC ITERGTHEEL MSIWGDYATI 780 

FNNLLLGETP PVEZNSKKET SGSQIOCSQDX GPKTOSVKKB KAVKPESGQIi VOLBBKGQGS 840 
VPWSVYGVYI QAAGGPLAFL VIMALFKUIV GSTAFSTWWL SYWIKQGSOI TTVTRXaiETS 900

VSDSMKDNPH KQYYASIYAL SMAVMLXLKA IRGWFVKGT LRASSRLHDB LPBRIItRSFM 960 

KFFDTTPTGR ILMRFSXDMD EVDVBLPFQA EKFZQHVILV PFCVGMIAGV FPWFLVAVGP 1020 

LVILFSVLHI VSRVLIRELK {ILI&?ITQSP? LSHZTSSIQG lATIHATWKB QSFLHRYQEL 1080 

U3DNQAPFFL PTCAMRWIAV RLDLISIALI TTTGLMIVLM HGQIPPAYAG LAISYRVQLT 1140

GLFCFTVRLA SETEARFTSV ERINHYIKTL SLEAPARIKN KAPSPDWI«QB GEVTFENAEM 1200 

RYRENLPLVL KKVSFTIKPK EKIGIVGaTG SGSSSLGMAL FRLVELSGGC IKIDGVRISD 1260 

IGIADLRSKIi SIIPQEPVLF SGTVRSNLDP FKQYTEDQIW DMiERTEMKE CIAQIiPIiKIiE 1320 

SEVMEKOaiP SVGERQLLCI ARALLRHOCI LILDSATAAM DTBTDLLIQE TIREAFADCT 1380 
KLTIAERIiHT VLGSDRIMVL AQGQWEPDT PSVLLSKDSS KFXAKPAAAB KKVAVKG 



Seq ID NO: 586 DNA sequence

Nucleic Acid Accession #: NM_001327.1

Coding sequence: 89..631

1 11 21 31 41 51

| | | | | |

AGCAGGGGGC GCTGTGTCTA CCGAGAATAC GAGAATACCT CGTGGGCCCT GACCTTCTCT 60 

CTGAGAGCCG GGCAGAGGCT CCGGAGCCAT GCAGGCCGAA GGCCGGGGCA CAGGGGGTTC 120 

GACGGGCGAT GCTGATGGCC CAGGACGCCC TGGCATTCCT GATGGCCCAG GGGGCAATGC 180 

TGGOGGCCCA GGAGAQGCGG GTGCCAOQGG OGGCAGAGGT CSCCCGGGGCG CAGGGGCftGC 240 

AAGGGCCTCG GGGCCGGGAG GAGGOGCCCC GCGGGGTCOG CATGGOGGOG OGGCTTCAGG 300 

CCTGAATGGA TGCTGCAGAT GCGGGGCXy^G GGGGCCGGAG AGCOGCCTGC TTGAGTTCTA 360 

CCTCGCCATG CCTTTCGC3GA CACCCATGGA AGCAGAGCTG GCCCGCAGGA GCCTGGCCCA 420 

G6ATGCCCX» COGCTTCCOG TGCCAGGGGT GCTTCTGAAG GA6TTCACTG TGTCCGGCAA 480 

CATACTGACT ATCCGACTGA CTGCTGCAGA CCACCGCCAA CTGCAGCTCT CCATCAGCTC 540

CTGTCTCCAG CAGCTTTCCC TGTTGATGTG GATCACGCAG TGCTTTCTGC CCGTGTTTTT 600 

GGCTCAGCCT CCCTCAGGGC AGAGGCGCTA AGCCCAGCCT GGOGCCOCTT OCTAGGTCAT 660 

GCCTCCTCCC CTAGQGAATG GTCCXy^GCAC GAGTGGCXZAG TTCATTCTGG GGGCCTGATT 720 
GTTTGTCGCT GGAGGAGGAC GGCTTACATG TTT6TTTCTG TAGAAAATAA AACTGAGCTA 

Seq ID NO: 587 Protein sequence 
Protein Accession #: NP_001318.1

1 11 21 31 41 51

| | | | | |

MQAEGRGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPGGGA 60 
PRGPHGGAAS GLNGCCRCGA RGPESRLLSF YLAMPFATPM EAELARRSLA QDAPPLPVPG 120 
VLLKEPTVSG MILTIBLTAA DHRQLQLSIS SCLQQLSLLM WITQCFLPVP lAQPPSGQRR 
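
The "Coding sequence" coordinates given with each DNA entry can be checked against the length of the protein entry that follows it. The sketch below assumes the coordinates are 1-based and inclusive and that the stated range ends with a stop codon; the function name expected_protein_length is illustrative and is not part of the listing.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Return the number of residues encoded by a coding range.

    Assumes 1-based inclusive coordinates and that the final codon
    of the range is a stop codon (so it contributes no residue).
    """
    n_bases = cds_end - cds_start + 1
    if n_bases <= 0 or n_bases % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return n_bases // 3 - 1  # subtract the stop codon


# Seq ID NO: 586 lists "Coding sequence: 89..631".
print(expected_protein_length(89, 631))  # prints 180
```

Under these assumptions the result for Seq ID NO: 586 equals the three full rows of 60 residues listed for Seq ID NO: 587.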



Seq ID NO: 588 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 52..459

1 11 21 31 41 51

| | | | | |

CCTOGTGGGC CCTGACCTTC TCTCTGAGAG CCGGGCAGAG GCTCCG6AGC CATGCAGGCC 60 
GAAG6CCAGG GCACAGGGGG TTOGACGGGC GATGCTGATO GCCCAGGAGG CCCTQGCATT 120 

CCTQATGGCC CAGGGGGCAA TGCTGGCGGC CCAGGAGAGG CGGGTGCCAC GGGCGGCAGA 180 
GGTCCCOGGG GCGCAGGGGC AGCAAGGGCC TCGGGGCCGA GAGGAGGOGC CCCGOGGGGT 240 
CCGCATGGOG GTGCCGCTTC TGCGCAGGAT GGAAGGTGCC CCTGCGGGGC CAGGAGGCCG 300 
GACAGCOGCC TGCTTCAGTT CGGACTGACT GCTGCAGACC ACOGCCAACT GCAGCTCTCC 360 
ATCAGCTCCT GTCTCCAGCA GCTTTCCCTG TTGATGTGGA TCACGCAGTG CTTTCTGCCC 420 
GTGTTTTTGG CTCAGGCTCC CTCAGGGCAG AGGCGCTAAG CCCAGCCTGG CGCCCCTTCC 480 
TAGGTCATGC CTCCTCCCCT AGGGAATGGT CCCAGCAOGA GTGGCCAGTT CArTGTGGGG 540 
GCCTGATTGT TTGTCGCTGG AGGAGGAOGG CTTACAT6TT TCTTTCTGTA GAAAATAAAG 600 
CTGAGCTA 
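
Each full row of the listing holds six blocks of ten characters followed by the cumulative position of the last character in the row. A minimal sketch of reading one such row is given below; the helper name parse_listing_row is hypothetical, and the code assumes that a trailing all-digit token is the position counter.

```python
from typing import Optional, Tuple


def parse_listing_row(line: str) -> Tuple[str, Optional[int]]:
    """Split a listing row into (sequence, end_position).

    Assumes whitespace-separated blocks and an optional trailing
    all-digit token holding the position of the row's last character.
    """
    tokens = line.split()
    end_position = None
    if tokens and tokens[-1].isdigit():
        end_position = int(tokens.pop())
    return "".join(tokens), end_position


row = "ATCAGCTCCT GTCTCCAGCA GCTTTCCCTG TTGATGTGGA TCACGCAGTG CTTTCTGCCC 420"
sequence, end_position = parse_listing_row(row)
print(len(sequence), end_position)  # prints: 60 420
```

A row whose joined length is not 60, or whose counter does not advance by 60 from the previous row, is a likely site of transcription damage.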



Seq ID NO: 589 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MQAEGQGTGG STGDADGPGG PGIPDGPGOI AGGPGEAGAT GGRGPRGAGA ARASGPRGGA 60 
PRGPHGGAAS AQOGRCPCGA RRPDSRLLQP RLTAADKRQL QLSISSCLQQ ZiSIiUIWITQC 120 
FLPVFLAQAP SGQRR 



Seq ID NO: 590 DNA sequence 
Nucleic Acid Accession #: NM_005562.1
Coding sequence: 90..3671

1 11 21 31 41 51

| | | | | |

ACAGCGGAGC GCAGAGTGAG AACCACCAAC CGAGGGGCCG 6GCAGCGAGC OCTGCAGCGG 60 

AGACAGAGAC TGAGCGGCCC GGCACCGCCA TGCCTGCGCT CTGGCTGGGC TGCTGCCTCT 120 

GCTTCTCGCT CCTCCTGCCC GCAGCCOSGG CCACCTCCAG GAGGGAAGTC TGTGATTGCA 180 

ATGGGAAGTC CAGGCAGTGT ATCTTTGATC GGGAACTTCA CAGACAAACT GGTAATGGAT 240 

TCOGCTGCCT CAACTGCAAT GACAACACTG ATGGCATTCA CTGCGAGAAG TGCAAGAATG 300 

GCTTTTACCG GCACAGAGAA AGGGACCGCT GTTTGCCCTG CAATTGTAAC TCCAAAGGTT 360 

CTCTTAGTGC TCGATGTGAC AACTCTGGAC GGT6CAGCT6 TAAACCAGGT GTGACAGGAG 420 

CCAGATGCGA CCGATGTCTG CCAGGCTTCC ACATGCTCAC GGATGOGGGG TGCACCCAAG 480 

AGCAGAGACT GCTAGACTCC AAGTGTGACT GTGACCCACC TGGCATCGCA GGGCCCTGTG 540 

ACGGGGGCCG CTGTGTCTGC AAGCCAGCTG TTACTGGAGA ACGCTGTGAT AGGTGTCGAT 600 

CAGGTTACTA TAATCTGGAT GGGGGGAAOC CTGAGGGCTG TACCCAGTGT TTCTGCTATG 660 

GGCATTCAGC CAGCTGCCGC AGCTCTGCAG AATACAGTGT CCATAAGATC ACCTCTACCT 720 

TTCATCAAGA TGTTGATGGC TGGAAGGCTG TCCAACGAAA TGGGTCTCCT GCAAAGCTCC 780

AATGGTCACA GCGCCATCAA GATGTGTTTA GCTCAGCCCA ACGACTAGAC OCTGTCTATT 840 
TTCTCGCTCX: •reCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900

TTGACTACOG TGTGGACAGA GGAGGCAGAC ACCCATCTGC CCVTGATGTG ATTCTGGAAG 960 

tffGC I G G TCT AGGQMrCACA GCTCCCTTGA TGCOVCTTOG CAAGACACTG OCTTCTQGGC 1020 

TCACTAAGAC TTACACATTC AGGTTAAATG AGCATOCAAG CAATAATTGG AGCCOOCACC 1080 

TGAGTTACTT TGW3TATOGA AGGTTACTGC GGAATCTCAC AGCCCTCOGC ATCCGAGCTA 1140

CATATGGAGA ATACAGTACT GGGTACATTG ACAATGTGAC CCTCATTTCA GCCCGCOCTG 1200 

TCTCTGGAGC CCCAGCACCC TGQGTTGAAC AGTGTATATC TCCTGTTGGG TACAAGQGGC 1260

AATTCIGCCA G6ATTGTOCT TCTGGCTACA AGAGACATTC AGCGAGACTG GGGCCTTTTG 1320 

GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380 

GTTATTCAGG GGATGAGAAT CCTGACATTG AiSTGTGCTGA CTGCCCAATT GGTTTCrACA 1440 

ACGATCCGCA CGACCCCCGC AGCTGCAAGC CATGTCCCTG TCATAACGGG TTCAGCIGCT 1500

CAGTGATGCC GGAGACGGAG GAGGTGGTGT GCAATAACTG CCCTOCCGGG GTCACCGGTG 1560 

CCCGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGCCCAG 1620 

TGAGGCCTTG TCAGCCCTCT CAATGCAACA ACAATGTGGA CCCCAGTGCC TCTGGGAATT 1680 

GTGACCGGCT GACAGGCAGG TCTTTGAAGT GTATCCACAA CACAGCCX3GC ATCTACTGOG 1740

ACCAGTGCAA AGCAGGCTAC TTCGGGGACC CATTGGCXXX: CAAOCCRfiCA GACAAGTCTC 1800 

GAGCTTCCAA CTGTAACCCC ATGGCCTCAG AGCCTGTAGG ATGTCGAAGT GATGGCACCT 1860

GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGTGAGCA TGGAGCATTC AGCTGTCCAG 1920 

CrrcCTATAA TCAAGTGAAG ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGG 1980 

AGGCCCTGAT TTCAAAGGCT CAGGGTGGTG ATGGAGTAGT ACCTGATACA GAGCTQGAAG 2040 

GCAGGATGCA GCAGGCTGAG CAGGCCCTTC AGGACATTCT GAGAGATGCC CAGATTTCAG 2100 

AAOGTGCTAG CAGATCCCTT GGTCTCCAGT TGGCCAAGGT GAGGAGCCAA GAGAACAGCT 2160 

ACCAGAGCCG CCTGGATGAC CTCAAGATGA CTGTGGAAAG AGTTCGGGCT CTGGGAAGTC 2220 

AGTACCAGAA CCGAGTTCGG GATACTCACA GGCTCATCAC TCAGATGCAG CTGAgCCTGG 2280

CAGAAAGTGA AGCTTCCTTG GGAAACACTA ACATTCCTGC CTCA6ACCAC TAOGTGGGGC 2340

CAAATGGCTT TAAAAGTCTG GCTCAGGAGG CCACAAGATT AGCAGAAAGC CACGTTGAGT 2400 

CAfiCCAGTAA CATGGAGCAA CTGACAAGGG AAACTGAGGA CTATTCCAAA CAAGCCCTCT 2460 

CACTCGTGCG CAAGGCOCTO CATGAAGGAG T0GGAAG06G AAGCGGTAGC COSGAOGGTG 2520 

CTGTGGTGCA AGGGCTTGTG GAAAAATCGG AGAAAACCAA GTOCCTGGOC CAGCAOTTGA 2580 

CAAGGGAGGC CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCOGCC 2640 

TCCTGGATTC AGTGTCTCGG CTTCAGGGAG TCAGTGATCA GTCCTTTCAG GTGGAAGAAG 2700 

CAAAGAGGAT CAAACAAAAA GCGGATTCAC TCTCAACGCT GGTAACCAGG CATATGGATG 2760 

AGTTCAAGCG TACACAAAAG AATCTGGGAA ACTGGAAAGA AGAAGCACAG CAGCTCTTAC 2820 

AGAATCGAAA AAGTGGGAGA GAGAAATCAG ATCAGCTGCT TTCCCGTGCC AATCTTGCTA 2880 

AAAGCAGAGC ACAAGAAGCA CTGAGTATGG GCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCTGAAG 3000 

AAGCCATCAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060 

AGCAAGCAGA AAGAGCCCTG GGGAGOGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120 

CCGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT CTGAACTTGG 3180 

AAGCCAATGT GACAGCAGAT GGAGCCITGG CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240 

GTCAGATGAG GGAAGTGGAA GGAGAGCTGG AAAGGAAGGA GCTGGAGTTT GACAOGAATA 3300 

TGGATCCAGT ACAGATGGTG ATTACAGAAG CCCAGAAGGT TGATACXaGA GCCAAGAACG 3360 

CTGGGGTTAC AATCCAAGAC ACACTCAACA CATTAGACGG CCTCCTGCAT CTGATGGACC 3420 

AGCCTCTCAG TGTAGATGAA GAGGGGCTGG TCTTACTGGA GCAGAAGCTT TCCCGAGCCA 3480 

AGACCCAGAT CAACAGCCAA CTGCC3GCCCA TGATGTCAGA GCTGGAACAG AGGGCACGTC 3540 

AGCAGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600 

AGAACTTGGA GAACATTAGG GACAACCTGC COCCAGGCTG CTACAATACC CAGGCTCTTG 3660 

AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGGGATA CAGATCTCAG 3720 

GGCTCGGGAG CCATGTCATG TGAGTGGGTG GGATGGGGAC ATTTGAACAT GTTTAATGGG 3780 

TATGCTCAGG TCAACTGACC TGACCCCATT CCTGATCOCA TGGCCAGGTG GTTGTCTTAT 3840 

TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGAGGC AGATAGCACT GGGTGTGAGA 3900 

ATGATCAAGG ATCTGGACCC CAAAGAATAG ACTGGATGGA AAGACAAACT GCACAGGCAG 3960 

ATGTTTGCCT CATAATAGTC GTAAGTGGAG TCCTGGAATT TGGACAAGTG CTGTTGGGAT 4020

ATAGTOVACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTTTGACT TTGGCCAGGC 4080 

ATGAAATTCT TCCTAATCTC AGAACAGA6T GCAACXXafiT CACACTGTGG CCAOTAAAAT 4140 

ACTATTGCCT CATATTGTCC TCTGCAAGCT TCTTGCTGAT CAGM3TTCCI CCTACTTACA 4200 

ACCCAGGGTG TGAACATGTT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTG 4260 

AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320 

GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TAGAGATTGC 4380 

ATTTTTATTA AAGCATTTOC TACCAGCAAA GCAAATCTTG CGAAAGTATT TACTTTTT06 4440 

GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500

ATTAGTCCTA ATTCAATCCT ACTTTTCGAA CACCAAAAAT GATGCGCATC AATGTATTTT 4560 

ATCTTATTTT CTCAATCTCX: TCTCTCTTTC CTCCACCCAT AATAAGAGAA TGTTCCTACT 4620 

CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTCCATOCA 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTG CCAGGGGCTG 4740 

GTCGGACAGT GGTGACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800 

AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTGTTTATT 4860 

GCAATAACCG CTTGGTTTGC AACCTCTTTG CTCAACAOAA CATATGTTGC AAGAOCXTCC 4920

CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAGCTCT GGGTTGTGCA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATTGC AACAGACTGT TGAGTTATGA 5040 

TAACACCAGT GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCTTGGCT GGGAAGACTA 5100 

TGGTGCTGCX; TTGCTTCTGT ATTTCCTTGG ATTTTCCTGA AAGTGTTTTT AAATAAAGAA 5160 
CAATTGTTAG ATGOC 

Seq ID NO: 591 Protein sequence
Protein Accession #: NP_005553.1

1 11 21 31 41 51 

| | | | | |

MPALWLGCCL CFSLLLPAAR ATSRREVCDC NGKSRQCIFD RELHRQTGHG FRCLNCNDNT 60 

DGIHCEKCKN GFYRHRERDR CLPCKCNSKG SLSARCDNSG RCSCKPGVTG ARCDRCLPGF 120 

HMLTDAGCTO DQRLLDSKCD CDPAGIAGPC DAGRCVCKPA VTGERCDRCR SGYYNLDGGN 180 

PBGCTQCFCY GHSASCRSSA EYSVHKITST FHQDVDGWKA VQRKGSPAKL QWSQRHQDVP 240 

SSAQRLDPVY FVAPAKFIjGN QQVSYGQSLS FDYRVDRGGR HPSAHDVILE GAGLRITAPL 300 

MPU5KTLPCG LTKTYTPRLM EHPSNKHSPQ LSYFElfRRLL RNLTALRIBA TYGEYSTGYI 360 

ONVTLISARP VSGAPAPWVE QCICPVGYKG QPOQDCASGY KRDSARUSPP GTCIPOroOG 420 

GGACDPDTGD CYSGDEMPDI ECADCPIGFY NDEHDPRSCK PCPOWGFSC SVMPETEBW 480 
CKNCPPGVTC ARCELCADGY FGDPFGaiG? VRPOQPOQOI KSVDPSASGB aSRLTGROtK 540

CIHNTAGIYC DQCKAGYFGD PIAPNPADKC HAQICKPKGS BPVGCRSDGT CVCKPGFGGP 600 

KCEHGAFSCP ACYHQfTKIQM DQFMQQLQRM EALISKAQGG DGWPDTELE GRKQQAEQAL 660 

QDILRDAQXS EGASRSLGLQ LAKVRSQENS YQSRUJOliIM TVERVRALGS QYQNaVSDTH 720 

RLITQMQLSL AESEASLGNT NIPASDHYVG PNGPKSIiAjQE ATRLABSHVB SASNMSQLTR 780 

ETEDYSKQAL SLVRKALHBG VGSGSGSFDG AWQGLVEKL EKTKSIiAQQIi TREATQAEIE 840 

ADRSYQHSLS LLDSVSRLQG VSDQSFQVEE AKHIKQKADS LSTLVTRHHD EFKRTQKSLG 900 

NWKEEAQQLL QNGKSGREKS DQI^LSRANLA KSRAQSAI*SM GSATPYEVES ILKNLaEPDL 960 

QVDNRKA2AE EAMKRLSYIS QKVSDASDKT QQAERALGSA AADAQRAKNG AGEAI/EISSE 1020 

IBQEIGSLNL EAMVTADGAL AMBKGLASLS SEMRBVEGBL ERKBUSFDTO «2>AVQMVITB 1080

AQKVirniAKN AGVTIQDTLN TUXaLHLKD QPLSVDEEGI, VtLEQKLSRA KTQINSQLRP 1140 

^e4SELEE^^AS qqrghlhlle tsidgiladv shlenisdstl ppgcvntqal bqq 

Seq ID NO: 592 DNA sequence
Nucleic Acid Accession #: AF101051.1
Coding sequence: 221..856

1 11 21 31 41 51

GAGCAACCTC LcTTCTAGT ATCCAGACTC CAGCGCCGCC CCGGGCGCGG ACCCCAACCC 60 

CGACCCAGAG CTTCTCCAGC GGOGGCGCAG CGACCAGGGC TCCCCGCCTT AACTTCCTCC 120 

GCGGGGCCCA GCCACCTTCG GGAGTCOGGG TTGCCCACCT GCAAACTCTC CGCCTTCTGC 180 

ACCTGCCACC CCTCAGCCAG CGCGGGOGCC CGAGOGAGTC AIGGCX^AOG CGGGGCTGCA 240 

GCTGTTCGGC TTCATTCIOG CCTTCCTGGG ATGGATOGGC GCCATOGTCA GCACTGCCCT 300 

GCCXX»GTCG AGGATTTACT CXTTATGCCGG OCACAACATC GTGACCGCXX: AGGCCATGTA 360 

OGAGGGGCTG TGGATCTCCT GCGTGTCGCA GAGCACOSGG CAGATCCRffT GCAAAGTCTT 420 

TGACTCCTTC CTGAATCTGA GCAGCACATT GCAAGCAACX: CGTGCCTTCA TGGTCGTTGG 480 

CATCCTCCTG GGAGTCATAG CAATCTTTGT GGCCACCGTT GGCATGAAGT GTATGAAGTG 540 

CTTGGAAGAC GATGAGQTGC AGAAGATGAG GATGGCTGTC ATTG6GGGTG CGATATTTCT 600 

TCTTGCaiGGT CTGGCTATTT TAGTTGCCAC AGCATGGTAT GGCAATAGAA TCGTTCAAGA 660 

ATTCTATGAC CCTATCACCC CAGTCAATGC CAG6TACGAA TTTGGTCAGG CTCTCTTCAC 720 

TGGCTGGGCr GCTGCTTCTC TCTGCCTTCT GQGAGGTGCC CTACTTTGCT GTTCCTGTCC 780 

CCGAAAAACA ACCTCTTACC CAACACCAAG GCCCTATCCA AAACCTGCAC CTTCCAGCGG 840 

GAAAGACTAC GTGTGACACA GAGGCAAAAG GAGAAAATCA TGTTGAAACA AACCGAAAAT 900 

GGACATTGAG ATACTATCAT TAACATTAGG ACCTTAGAAT TTTGGGTATT GTAATCTGAA 960 

GTATGGTATT ACAAAACAAA CAAACAAACA AAAAACCCAT GTGTTAAAAT ACTCAGTGCT 1020 

AAACATGGCT TAATCTTATT TTATCTTCTT TCCTCAATAT ACGAGGGAAG ATTTTACCAT 1080 

TTGTATrACT GCTrCCCATT GAGTAATCAT ACTCRAATGG GGGAAGG6GT GCTCCTTAAA 1140 

TATATATAGA TATGTATATA TACATGTTTT TCTATTAAAA ATAGACAGTA AAATACTATT 1200 

CrCATTATGT TGATACTAGC ATACTTAAAA TATCTCTAAA ATAGGTAAAT GTATTTAATT 1260 

CCATATTGAT GAAGATGTTT ATTGGTATAT TTTCTTTTTC GTCCTTATAT ACATATGTAA 1320 

CAGTCAAATA TCATTTACTC TTCTTCATTA GCTTTGGGTG CCTTTGCCAC AAGACCTAGC 1380 

CTAATTTACC AAGGATGAAT TCTTTCAATT CTTCATGCGT GCCCTTTTCA TATACTTATT 1440

TTATTTTTTA CCATAATCTT ATAGCACTTG CATCGTTATT AAGCCCTTAT TTGTTTTGTG 1500 

TTTCATTGGT CTCTATCTCC TGAATCTAAC ACATTTCATA GCCTACATTT TAOTTTCTAA 1560

AGCCAAGAAG AATTTATTAC AAATCAGAAC TTTGGAGGCA AATCTTTCTG CATGACCAAA 1620 

GTGATAAATT CCTGTTGACC TTCCCACACA ATCCCTGTAC TCTGACCCAT AGCACTCTTG 1680 

TTTCCTTTGA AAATATTTGT CCAATTGAGT AGCTGCATGC TGTTCCCCCA GGTGTTGTAA 1740 

CACAACTTTA TTGATTGAAT TTTTAAGCTA CTTATTCATA GTTTTATATC CCCCTAAACT 1800 

AOCTTTTTCT TCCCCATTCC TTAATTGTAT TCTTTTCCCA AGTGTAATTA TCATGCGTTT 1860 

TATATCTTCC TAATAAGGTG TGGTCTGTTT GTCTQAACAA AGTGCTAGAC TTTCTGGAGT 1920 

GATAATCTGG TGACAAATAT TCTCTCTGTA GCTGTAAGC3V AGTCACTTAA TCTTTCTACC 1980 

TCTTTTTTCT ATCTGCCAAA TTGAGATAAT GATACTTAAC CABTTAGAAG AGGTAGTGTG 2040 

AATATTAATT AGTTTATATT ACTCTCATTC TTTGAACATG AACTATGCCT ATGTAGTGTC 2100 

TTTATTTGCT CAGCTCGCTG AGACACTGAA GAAGTCACTG AACAAAACXT ACACACGTAC 2160 

CTTCATGTGA TTCACTGCCT TCCTCTCTCT ACCAGTCTAT TTCCACTGAA CAAAACCTAC 2220 

ACACATACCT TCATGTGGTT CAGTGCCTTC CTCTCTCTAC CAGTCTATTT CCACTGAACA 2280 

AAACCTACX5C ACATACCTTC ATGTGGCTCA GTGCCTTCCT CTCTCTACCA CTCTATTTCC 2340 

ATTCTTTCAG CTGTGTCTGA CATGTTTGTG CTCTGTTCCA TTTTAACAAC TGCTCTTACT 2400 

TTTCCAGTCT GTACAGAATG CTATTTCACT TGAGCAAGAT GATGTATGGA AAGGGTGTTG 2460 

GCACTGGTGT CTGGAGACCT GGATTTGAGT CTTGGTGCTA TCAATCACCG TCTGTGTTTG 2520 

AGCAAGGCAT TTGGCTGCTG TAAGCTTATT GCTTCATCTG TAAGCGGTGG TTTGTAATTC 2580

CTGATCTTCC CACCTCACAG TGATCTTGTG GGGATCCAGT GAGATAGAAT ACATGTAAGT 2640 

GTGGTTTTGT AATTTGAAAA GTGCTATACT AAGGGAAAGA ATTGAGGAAT TAACTGCATA 2700 

CXrrTTTGGTG TTCCTTTTCA AATCTTTGAA AATAAAAAAA TGTTAAGAAA TGGGTTTCTT 2760 

GCCTTAACCA GTCTCTCAAG TGATGAGACA GTGAAGTAAA ATTGAGTGCA CTAAAOGAAT 2820

AAGATTCTGA GGAAGTCTTA TCTTCTGCAG TGAGTATGGC CCAATGCTTT CTGTGGCTAA 2880 

ACAGATGTAA TGGGAAGAAA TAAAAGCCTA CGTGTTGGTA AATCCAACAG CAAGGGAGAT 2940 

TTTTGAATCA TAATAACTCA TAAGGTGCTA TCTGTTCAGT GATGCCCTCA GAGCTCTTGC 3000 

TGTTAGCTGG CAGCTGACGC TGCTAGGATA GTTAGTTTGG AAATGGTACT TCATAATAAA 3060

CTACACAAGG AAAGTCAGCC ACOGTGTCTT ATGAGGAATT GGACCTAATA AATTTTAGTG 3120

TGCCTTCCAA ACCTGAGAAT ATATGCTTTT GGAAGTTAAA ATTTAAATGG CTTTTGCCAC 3180 

ATACATAGAT CTTCATGATG TGTGAGTGTA ATTCCATGTG GATATCAGTT ACCAAACATT 3240 

ACAAAAAAAT TTTATGGCCC AAAATGACCA ACGAAATTGT TACAATAGAA TTTATCCAAT 3300 

TTTGATCTTT TTATATTCTT CTACCACACC TGGAAACAGA CCAATAGACA TTTTGGGGTT 3360 

TTATAATCGG AATTTGTATA AAGCATTACT CTTTTTCAAT AAATTGTTTT TTAATTTAAA 3420 
AAAAGGAAAA AAAAAAAAAA AAA 

Seq ID NO: 593 Protein sequence
Protein Accession #: AAD16433.1

1 11 21 31 41 51 

MAKAGLQLLG FILAFLGWIG AIVSTALPQW RIYSYACTNI VTAQAMYEXSL HMSCVSQSTG 60 

QIQCKVFDSL UJLSSTLQAT RALMWGILL GVIAIFVATV GMKCHKOiED DEVQKMRMAV 120 

IQGAIFLLAG LAILVATAWY GNRIVQEFYD PMTPVMARYE FGQALFTGWA AASLCLLGGA 180 
LLCCSCPRKT TSYPTPHPYP KPABSSGKDY V 
Seq ID NO: 594 DNA sequence

Nucleic Acid Accession #: NM_006180.1

Coding sequence: 352..2820

1 11 21 31 41 51

| | | | | |

CCCCCATTOG CATCTAACftA GGAATCTGOG CCCCftGAGAC TCCCGGAOGC CGCCGGTCGG 60 

TCCCCGGCGC GCOtSGGCCAT GCAGCGACGG COGCCGCGGA GCTCOGAGCA GOGGTAGCGC 120

CCCCCTCTAA AGCGGTTCGC TATGCCGGGA CCACTGTGAA CCCTGCOGCC TGCOGGAACA 180 

CrCTTCGCTC CGGACCAGCT CAGCCTCTGA TAAGCTG6AC TGGGCA0CC3C CGCAACAAGC 240 

ACCGAGGAGT TAAGAGAGCC GCAAGCGCAG GGAAGGCCTC CCOGCAOGGG TGGGGGAAAG 300 

CX3GCCX3GTGC AGCGCGGGGA CAGGCACTOG GGCTGGCACT GGCTGCTAGG GATGTOGTCC 360

TOGATAAGGT GGCATGGACC OGCCATGGCG CGGCTCTGGG GCTTCTGCTG GCTGGTTGTG 420 

GGCTTCTGGA GGGCCGCTTT OGCCTGTCCC ACGTCCTGCA AATGCAGTGC CTCTCGGATC 480 

TUQTCCAGOG ACCCTTCPOC TOGCATOGTG GCATTTCOGA GATTGGAGCC TAACAGTGTA 540

GATCCTGAGA ACATCACCGA AATTTTCATC GCAAACCAGA AAAGGTTAGA AATCATCAAC 600 

GAAGATGATG TTGAAGCTTA TGTGGGACTG AGAAATCTGA CAATTGTGGA TTCTCGATTA 660

AAATTTGTGG CTCATAAAGC ATTTCTGAAA AACAGCAACC TGCAGCACAT CAATTTTAOC 720 

CGAAACAAAC TGACGAGTTT GTCTAGGAAA CATTTCCGTC ACCTTGACTT GTCTGAACTG 780 

ATCCTGGTGG GCAATCCATT TACATGCTCC TGTGACATTA TGTGGATCAA GACTCTCCAA 840 

GAGGCTAAAT CCRGTCCAGA CACTCAGGAT TTGTACTGCC TGAATGAAAG CAGCAAGAAT 900 

ATTCCCCTGG CAAACCTCCA GATACCCAAT TCTGGTTTGC CATCTGCAAA TCTGGCCGCA 960 

CCTAACCTCA CTGTGGAGGA AGGAAAGTCT ATCACATTRT CCTGTAGTGT GGCAGGTGAT 1020 

COGGTTCCTA ATATGTATTG GGATGTTGGT AACCTGGTTT CCAAACATAT GAATGAAACA 1080 

AGCCACACAC AGGGCTCCTT AAGGATAACT AACATTTCAT COGATGACAG TGGGAAGCAG 1140 

ATCTCTTGTG TGGCGGAAAA TCTTGTAGGA GAAGATCAAG ATTCTGTCAA CCTCACTGTG 1200 

CATTTTGCAC CAACTATCAC ATTTCTCGAA TCTCCAACCT CAGACCACCA CTGGTGCATT 1260 

OCATTCACTG TGAAAGGCAA CCCCAAACCA GCGCTTCaGT GGTTCTATAA CGGGGCAATA 1320 

TTGAATGAGT CCAAATACAT CTGTACTAAA ATACRTGTTA CCAATCACAC GGAGTACCAC 1380 

GGCTGCCTCC AGCTGGATAA TCCCACTCAC ATGAACAATG GGGACTACAC TCTAATAGCC 1440 

AAGAATGAGT ATGGGAAGGA TGAGAAACAG ATTTCTGCTC ACTTCATGGG CTGGCCTGGA 1500 

ATTGACGATO GTGCAAACCC AAATTATCCT GATGTAATTT ATGAAGATTA TGGAACTGCA 1560 

GCGAATGACA TaSOQGACAC CACGAACAGA AGTAATGAAA TCCCTTCCAC AGACGTCACT 1620 

GATAAAACOG GTCGGGAACA TCTCTCGGTC TATGCTGTGG TGGTGATTGC GTCTGTGGTG 1680 

GGATTTTGCC TTTTGGTAAT GCTGTTTCTG CTTAAGTTGG CAAGACACTC CAAGTTTGGC 1740 

ATGAAAGGCC CAGCCTCCGT TATCAGCAAT GATGATGACT CTGCCAGCCC ACTCCATCAC 1800 

ATCTCCAATG GGAGTAACAC TCCATCTTCT TCGGAAGGTG GCCCAGATGC TGTCATTATT 1860 

GGAATGACCA AGATCCCTGT CATTGAAAAT CCCCAGTACT TrGGCATCAC CAACAGTCAG 1920 

CTCAAGCCAG ACACATTTGT TCAGCACATC AAGCGACATA ACATTGTTCT GAAAAGGGAG 1980 

CTAGGCGAAG GAGCCTTTGG AAAAGTGTTC CTAGCTGAAT GCTATAACCT CTGTCCT6AG 2040 

GAGGACAAGA TCTTGGTGGC A6TGAAGACC CTGAAGGATG CCAGTGACAA TGCACX5CAAG 2100 

GACTTCCACC GTGAGGCOGA GCTCCTGACC AACCTCCaGC ATGAGCACAT CGTCAAGTTC 2160 

TATGGCGTCT GCGTGGAGGG CGACCCCCTC ATCATGGTCT TTGAGTACAT GAAGCATGGG 2220 

GACCTCAACA AGTTCCTCAG GGCACAOGGC CCTGATGCOG TGCTGATGGC TGAGGGCAAC 2280 

CCGCCCACGG AACT6AC3GCA GTOGCAGATC CTGCATATAG CCCAGCAGAT CGCOGCGGGC 2340 

ATGGTCTACC TGGGGTCCCA GCACTTOGTG CACCGCGATT TGGCCACCAG GAACTGCCTG 2400 

GTCGGGGAGA ACTTGCTGGT GAAAATOGGG GACTTTGGGA TGTCCXXWGA CGTGTACAGC 2460 

ACTGACTACT ACAGGGTCGG TGGCCACACA ATGCTGCXXa^ TTCGCTGGAT GCCTCCA6AG 2520

AGCATCATGT ACAGGAAATT CACGAOGGAA AGOGAOGTCT GGAGCCTGGG GGT0GTGTT6 2580 

TGGGAGATTT TCACCTATGG CAAACAGCCC TGGTACCAGC TGTCAAACAA TGAGGTGATA 2640 

GAGTGTATCA CTCAGGGCCX3 AGTCCTGCAG CGACCCCGCA CGTGCCCCCA GGAGGTGTAT 2700 

GAGCTGATGC TGGGGTGCTG GCAGCGAGAG CCCCACATGA GGAAGAACAT CAAGGGCATC 2760 

CATACCCTCC TTCAGAACTT GGCCAAGGCA TCTCOGGTCT ACCTGGACAT TCTAGGCTAG 2820 

GGCCXrrTTTC CCCAGACOGA TCCTTCOCAA OGTACTOCTC AGAOGGGCTG AGAGGATGAA 2880 

CATCTTTTAA CTGCCGCTGG A6GCCACCAA GCTGCTCTCC TTCACTCTGA CAGTATTAAC 2940 

ATCAAAGACT CCGAGAAGCT CTCGAGGGAA GCAGTGTGTA CTTCTTCATC CATAGACACA 3000 

GTATTGACTT CTTTTTGGCA TTATCTCTTT CTCTCTTTCC ATCTCCCTTG GTTGTTCCTT 3060 

TTTCTTTrrr taaattttct 'm'rcri ' crr TTrrrrcGTC ttccctgctt cacgattctt 3120 

ACCCTTTCTT TTGAATCAAT CTGGCTTCTG CATTACTATT AACTCTGCAT AGACAAAGGC 3180 

CTTAACAAAC GTAATTTGTT ATATCAGCAG ACACTCCAGT TTGCCCACCA CAACTAACAA 3240 

TGCCTTGTTG TATTCCTGCC TTTGATGTGG ATGAAAAAAA GGGAAAACAA ATATTTCACT 3300 

TAAACTTTGT CACTTCTGCT GTACAGATAT OQAGAGTTTC TATGOATTCA CTTCTATTTA 3360 

TTTATTATTA TTACTGTTCT TATTGTTTTT GGATGGCTTA AGCCTGTGTA TAAAAAAGAA 3420 

AACTTGTGTT CAATCTGTGA AGCCTTTATC TATGGGAGAT TAAAACCAGA GAGAAAGAAG 3480 

ATTTATTATG AACCGCAATA TGGGAGGAAC AAAGACAACC ACTGGGATCA GCTGGTGTCA 3540 

GTCCCTACTT AGGAAATACT CAGCAACTGT TAGCTGGGAA GAATGTATTC GGCACCTTCC 3600

CCTGAGGAOC TTTCTGAGGA GTAAAAA6AC TACTGGOCTC TGTGCCATGG ATGATTCTTT 3660
TCCCATCACC AGAAATGATA GCGTGCAGTA GAGAGCAAAG ATGGCTT 

Seq ID NO: 595 Protein sequence 
Protein Accession #: NP_006171.1

1 11 21 31 41 51 

MSSHIRHH6P AMARLWGFCW LWGFHRAAF ACPTSOCCSA SRIHCSDPSP 6IVAFPRLEP 60 

NSVDPEHITE IPIANQKRLB IINEDDVEAY VGLSMLTIVD SGLKPVAHKA PLKMSNLaHI 120 

NPTRNKLTSL SRKHFRHLDL SELILVGMPP TCSCDIMWIK TLQEAKSSPD TQDLYCIjNES 180 

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNKYW DVGMLVSKHM 240 

NETSirroGSL RZTSXSSDDS GKQZSCVABN LVGEDQDSVN LTVHPAPTIT FLESPTSDHH 300 

WCIPFTVKGN PKPALQMPYM OAlLNESKVl CTKIHVTOHT SYHGCLQUSN PTHMNKCTYT 360 

I.XAKMEYGKD EKQISAHFKG WPGIDDGAMP mfPDVlYEDY GTAAHDI<H)T TNRSNEIPST 420 

DVTDKTGREH LSVYAWVIA SWGFCLLVM LFLLKLARHS KPGMKGPASV ISNDDDSASP 480 

LHHISNGSNT PSSSEGGPDA VIIGMTKIPV lENPQYFGIT NSQUCPDTFV QHIKRHNIVL 540 

KRELGEGAFG KVFIAECYNIi CPBQDKXLVA VKTLKDASDN ARiCDFHREAE LLtVU2H£KI 600 

VKFYGVCVEG DPIiIMVFEYM KHGDUnCFLR AHGPDAVLMA EQIPPTELTQ SQMLBIAQQI 660 

AAGMVYLASQ HFVBSDtATR KCLVGENLLV KIGDFGMSRD VYSTDYVRVC GHTKLP1R«M 720 




ppESDCYRKF TTBSIIVWSI/5 WU^SIFTYG KQFrfYQLSKN BVIECITQCa VLQRPSTCPQ 780 
EVYELMI/3CW QSEPHMRKNI KGIHTLLQNL AKASPVYLDI LG 

Seq ID NO: 596 DNA sequence 
Nucleic Acid Accession #: AF410899
Coding sequence: 483..2999

1 11 21 31 41 51

| | | | | |

GGGAGCAC5GA GCCTOGCTGG CTGCTTCX5CT CGCGCTCTAC GCCCTCASTC CCCGGCGGTA 60 

GCAGGAGCCT GGACCCAGGC GCOGGCGGOG GGOGTGAGGC GCOGGAGCCC GGCCTOGAGG 120 

TGCATACCGG ACCCCCATTC GCATCTAACA AGGAATCTGC GCCCCAGAGA GTCCOGGAOG 180 

CCGCXX3GT06 G10CCGG6GG OGCCGGGCCA TGCAGCGAOG GCCGCCGCGG AGCTCOGAGC 240 

AGCGGTAGCX5 CCCCCCTGTA AAGCGGTTOG CTATGCCGGG ACCACTGTGA ACCCTGCCGC 300 

CTGCCGGAAC ACTCTTCGCT CCGGACCAGC TCAGCCTCTG ATAAGCTCGA CTCGGCAOSC 360 

CCGCAACAAG CAODGAGGAG TTAAGAGAGC CGCAA60GCA GGGAAGGCCT 0CCGGCAGG6 420 

GTGGGGGAAA GOGGCOGGTG CAGCGCGGGG ACAGGCACTC GGGCTGGCAC TOGCTGCTAG 480 

GGATGTCGTC CTGGATAAGG TGGCATGGAC CCGCCATGGC GCGGCTCTGG GGCXTCTGCT 540 

GGCTGGTTGT GGGCTTCTGG AGGGCCGCTT TCGCCTGTCC C3VCGTCCTGC RAATGCAGTG 600 

CCTCTCGGAT CTGGTGCAGC GACCCTTCTC CTGGCATCGT GGCATTTCCG AGATTQGAGC 660 

CrAACWSTGT AGATCCTGAG AACATCACCG AAATTTTCAT OCCAAAOCAG AAAAOCTTAfi 720 

AAATCATCAA OGAAGATGAT GTTGAAGCTT ATGTGGGACT GA6AAATCTG ACAATTGTOG 780 

ATTCTGGATT AAAATTTGTG GCTCATAAAG CATTTCTGAA AAACAGCAAC CTGCAGCACA 840 

TCAATTTTAC COGAAACAAA CTGACGAGTT TGTCTAGGAA ACATTTCOC3T CACCTTGACT 900 

TGTCTGAACT GATCCTGGTG GGCAATCCAT TTACATGCTC CTGTGACATT ATGTGGATCA 960

AGACTCTCCA AGAGGCTAAA tCCAGTCCAG ACACTCAGGA TfTGTACPGC CTGAATGAAA 1020

GCAGCAAGAA TATTCCCCTG GCAAACCTGC AGATACCCAA TTGTGGTTTG CX»TCTGCAA 1080 

ATCTGGCCGC ACCTAACCTC ACTGTGGAGG AAG6AAAGTC TATCACATTA TCCTGTAGTG 1140 

TGGCAGGTGA TCOGGTTCCT AATATGTATT GGGATGTTGG TAAOCTGGTT TCCAAACATA 1200 

TGAATGAAAC AAGCCACACA CAGGGCTCCT TAAGGATAAC TAACATTTCA TCOGATCACA 1260 

GTCGGAAGCA GATCTCTTCT GTGGGGGAAA ATCTTGTAGG AGAAGATCAA GATTCTGTCA 1320 

ACCTCACTGT GCATTTTGCA CCAACTATCA CATTTCTCGA ATCTOCAACC TCAGACCACC 1380 

ACTGGTGCAT TCCATTCACT GTGAAAGGCA ACCCCAAACC AGCGCTTCAG TGGTTCTATA 1440 

ACGGGGCAAT ATTGAATGAG TCCAAATACA TCTGTACTAA AATACATGTT AGCAATCACA 1500 

CGGAGTACCA OGGCTGCXTTC CAGCTG6ATA ATCCCACTCA CATGAACAAT GGGGACTACA 1560

CTCTAATAGC CAAGAAT6AG TATGGGAAGG ATGAGAAACA GATTTCTGCT CACTTCATGG 1620 

GCTGGCCTGG AATOACGAT GGTGCAAACC CAAATTATCC TGATGTAATT TATGAAGATT 1680 

ATGGAACTCC AGCGAATGAC ATCGGGGACA CCACGAACAG AAGTAATGAA ATCCCTTCCA 1740 

CAGAOGTCAC TGATAAAACC GGTCGGGAAC ATCTCTCGGT CTATGCTGTG GTGGTGATTG 1800 

CGTCTGTGGT GGGATTTTGC CTTTTGGTAA TGCTGTTTCT GCTTAAGTTG GCAAGACACT 1860 

CCAAGTTTGG CATGAAAGAT TTCTCATGGT TTGGATrTGG GAAAGTAAAA TCAAGACAAG 1920 

GTGTTGGCCC AGCCTCCGTT ATCAGCAAT6 ATGATGACTC TGCCAGCCCA CTCCATCACA 1980 

TCTCCAATGG GAGTAACACT CCATCTTCTT CGGAAGGTGG CCCAGATGCT GTCATTATTG 2040 

GAATGACCAA GATCCCTGTC ATTGAAAATC CCCAGTACTT TGGCATCAC!C AACAGTCAGC 2100 

TCAAGCCAGA CACATTTGTT CAGCACATCA AGCGACATAA CATTGTTCTG AAAAGGGAGC 2160 

TAGGCGAAGG AGCCTTTGGA AAAGTGTTCC TAGCTGAATG CTATAACCTC TGTCCTGAGC 2220 

AGGACAAOAT CTTGGTGGCA GTGAAGACCC TGAAGGATGC CAGTGACAAT GCAOGCAAGG 2280 

ACTTCCACCG TGAGGC06AG CTCCTGACCA ACCTCCAGCA TGAOCACATC GTCAAGTTCT 2340 

ATGGCGTCTG CGTGGAGGGC GACCCCCTCA TCATGGTCTT TGAGTACATG AAGCATGGGG 2400 

ACCTCAACAA GTTCCTCAGG GCACACGGCC CTGATGCCGT GCTGATGGCT GAGGGCAACC 2460 

CGCCCACGGA ACTGAGGCAG TCGCAGATGC TGCATATAGC CCAGCAGATC GCCGCJGGGCA 2520 

TGGTCTACCT GGCGTCCCAG CACTTCGTGC ACOGCGATTT GGCCACCAGG AACTGCCTGG 2580 

TCGGGGAGAA CTTGCTGGTG AAAATCGGGG ACTTTGGGAT GTCCOGGGAC GTGTACAGCA 2640 

CTGACTACTA CAGGGTCGGT GGCCACACAA TGCTGCCCAT TOGCTGGATG CCTCCAGAGA 2700 

GCATCATGTA CAGGAAATTC AOSACGGAAA GCGACGTCTG GAGCCTGGGG GTOGTGTTGT 2760 

GGGAGATTTT CACCTATGGC AAACAGCCCT GGTAOCAGCT GTCAAACAAT GAQGTGATAG 2820 

AGTGTATCAC TCAGGGCCGA GTCCTGCAGC GACCCC6CAC GTGCCCCCAG GAGGTGTATG 2880 

AGCTGATGCT GGGGTGCTGG CAGOGAGAGC CCCACATGAG GAAGAACATC AAOGGCATCC 2940 

ATACCCTCCT TCAGAACTTG GCCAAGGCAT CTCCGGTCTA CCTGGACATT CTAGGCTAGG 3000 

GCCCTTrrOC GCAGACOGAT CCTTCCCAAC GTACTCCTCA GAOGGGCTGA GAGGATGAAC 3060 

ATCTTTTAAC TGCOGCTGGA GGCCACCAAG CTGCTCTCCT TCACTCTGAC AGTATTAACA 3120 

TCAAAGACTC OGAGAAGCTC TCGAGGGAAG CAGTGTGTAC TTCTTCATCC ATAGACACAG 3180 

TATTGACTTC TTTTTGGCAT TATCTCTTTC TCTCTTTCCA TCTCCCTTGG TTGTTOCTTT 3240 

TTCTTTTTTT AAATTTTCTT TTTCTTCTTT TTTTTCGTCT TCCCTGCTTC ACGATTCTTA 3300 

CCCiTfCTTT TGAATCAATC TGGCTTCTGC ATTACTATTA ACTCTGCATA GACAAAGGCC 3360

TTAACAAAOG TAATTTGTTA TATCAGCAGA CACTCCAGTT TGCCCACCAC AACTAACAAT 3420 

GCCTTGTTGT ATTCCTGCCT TTGATGTGGA TGAAAAAAAG GGAAAACAAA TATTTCACTT 3480

AAACTTTGTC ACTTCTGCTG TACAGATATC GAGAGTTTCT ATGGATTCAC TTCTATTTAT 3540 

TTATTATTAT TACTGTTCTT ATTGTTTTTG GATGGCTTAA GCCTGTGTAT AAAAAAGAAA 3600 

ACTTGTGTTC AATCTGTGAA GCCTTTATCT ATGGGAGATT AAAACCAGAG AGAAAGAAGA 3660 

TTTATTATGA ACCGCAATAT GGGAGGAACA AAGACAACCA CTGGGATCAG CTGGTGTCAG 3720 

TCCCTACTTA GGAAATACTC AGCAACTGTT AGCTGGGAAG AATGTATTOG GCACCTTCCC 3780

CTGAGGACCT TTCTGAGGAG TAAAAAGACT ACTGGCCTCT GTGCCATGGA TGATTCTTTT 3840 

CCCATCACCA GAAATGATAG CCTGCAGTAG AGAGCAAAGA TGGCTTOOGT GAGACACAAG 3900 

ATGGCGCATA GTGTGCTCGG ACACAGTTTT GTCTTOGTAG GTTCTGATGA TAGCACTGGT 3960 

TTGTTTCTCA AGCGCTATCC ACAGAACCTT TGTCAACTTC ACTTGAAAAG AGGTGGATTC 4020 
ATGTCCAGAG CTCATTTCGG GGTCAGGTGG GAAAGCC 

Seq ID NO: 597 Protein sequence 
Protein Accession #: AAL67965.1

1 11 21 31 41 51

| | | | | |

MSSWIRWHGP AMARLWGFCW LWGFWRAAF ACPTSC3CCSA SRIWCSDPSP GIVAFPRLEP 60 

NSVDPEMITS IPIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKPVAHKA FLIOTSNUJHI 120 

HPTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIKWIK TMJEAKSSPD TQDLYCLNES 180 

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNJffH DVGm*VSKHM 240 




NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTTr PliESPTSDHH 300 

KCIPFIVBCai PKPALQHFYN GAILKBSKYI CTKIHVTKHT EYHGCLQLDJJ PTHKHKGDVT 360 

LZAXBSyGSD EKQISAHEMQ HSGXSDGAHP NYFSViyEDy GTAAKDIGDT TMRSSSIPST 420

DVTDKltBiai LSVYAVWIA SWGPCI»LV« hSUJtLMSS KPGWCDPSWF GPGKVKSHQG 480 

VGPASVISZJD DDSASPI£HI SKGSKTPSSS EXXSPOAVIIG MTKIPVISI? QYFGITNSQL 540 

KPDTPVQHIK RHNIVLKREL GEGAFGKVFL AECYKIjCPEQ DXILVAVKTL KDASDNARKD 600 

FHREAELLTN LQHEHIVKFY GVCVBC3)PLI MVFBYMKHGD UJKFLRAHGP DAVLKAEGNP 660 

PTELTQSQKL RIAQQIAAGM VYLASQHFVH RDLATBNCLV GENLLVKIGD FGMSRDVYST 720 

DYYRVGGBTM LPISKMPPBS IMYRKFTTES OVWSLGWLW EIFTYGKQPW YQLS2INEVIE 780 
CITQGRVLQR PRTCPQEVYB IJ«!LGCWQRE? HMKKHIKGIH TLLQMIiAKAS PVYLDIUS 






Seq ID NO: 598 DNA sequence
Nucleic Acid Accession #: AB052906
Coding sequence: 74..814

1 11 21 31 41 51
| | | | | |
AAAACCTTGA GGTGATTCAT CTTCCAGGCT CTCCTTCCAT CAAGTCTCTC CTCCCTAGOG 60
CTCTGGGTCC TTAATGGCAG CAGCCGCCX3C TACCAAGATC CTTCTGTGCC tcccgcttct 120
GCTCCT6CTG TCSGGGCTGGT COOGGGCTGG GCGAGCCGAC CCTCACTCTC TTTGCTATGA 180
CATCAlCCGTC ATCCCTAAGT TCAGACCTGG ACCACGGTGG TGTGCGGTTC AAGGCCAGGT 240
GGATGAAAAG ACTTTTCTTC ACTATGACTG TGGCAACAAG acagtcacac CTGTCAGTCC 300
CCTGGGGAAG AAACTAAATG 7CACAAG0GC CTGGAAAGCA cagaaccx:ag TACTGAGAGA 360
GGTGGTGGAC ATACTTACAG AGCAACTGGG TGACATTCAG CTOGAGAATT ACACACCCAA 420
GGAACCCCTC ACCCTGCAGG CCAGGATGTC TTGTGAGCAG AAAGCTGAAG GACACAGCAG 480
TGOATCTTGG CAGTTCAGTT TCGATGGGCA GATCTTCCTC CTCTTTGACT CAGAGAAGAG 540
AATGTGGACA ACGGTTCATC CTGGAGCCAG AAAGATGAAA GAAAAGTGGG AGAATGACAA 600
GGTTCTGGCC ATGTCCTTCC ATTACTTCTC AATGGGAGAC TGTATAGGAT GGCTTGAGGA 660
CTTCTTGAT6 GGGATGGACA GCACCCTGGA GCCAAGTGCA GGAGCAGCAC T06CCATGTC 720
CTCAGGCACA ACCCAACTCA GGGCCACAGC CACCACCCTC ATOCTTTGCT GCCTCCTCAT 780
CATCCTCCCC TGCTTCATCC TCCCTGGCAT CTGAGGACAG TCCTTTAGAG TGACAGGTTA 840
AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACrCGCCC TTCTGTCTGG 900
CCAGCTGCCC AOIACCTACG GTGTATGTCC AGTGGCCTCC AGCAGATCAT gatgacatca 960
TGGACCCAAT AGCTCATTCA CTGCCTTGAT TCCTTTTGCC AACAATTTTA ccagcagtta 1020
TIICCTAACAT ATTATGCAAT TTTCTCTTGG TGCTACCTGA TGGAATTCCT gcacttaaag 1080
TTCTGGCTGA CTAAACAAGA TATATCATTT TCTTTCTTCT crrrrfGrri' ggaaaatcaa 1140
GTACTTCTTT GAATGATGAT CTCTTTCTTG CAAATGATAT TGTCAGTAAA ataatcaogt 1200
TAGACTTCAG ACCTCTGGGG ATTCTTT0CX5 TGTCCTGAAA gagaattttt aaattattta 1260
ATAAGAAAAA ATTTATATTA ATGATTGTTT CCTTTAGTAA TTTATTGTTC TGTACTGATA 1320
rrCAAATAAA GAGTTCTATT TCCCAAAAAA AAAAAAAAAA AA



Seq ID NO: 599 Protein sequence
Protein Accession #: BAB61048.1

1 11 21 31 41 51
| | | | | |
MAAAAATKIL LCLPLUtLLS GWSRAGRADP HSLCYDITVI PKFRPGPRWC AVQGQVDEKT 60
PLHYDoancr vtpvsplgkk lnvttawkaq npvlrewdi lteqlrdiql enytpkeplt 120
LQARMSCEQK AEGHSSGSHQ PSPDGQIPLL FDSEKHMWTT VHPGARKMKE KWENDKWAM 180
SFHYFSMGDC IGHLEDPLMG MDSTLEPSAG APLAMSSGTT QLRATATTLI LCCLLIItiPC 240
PILPGI



Seq ID NO: 600 DNA sequence
Nucleic Acid Accession #: NM_001898.1
Coding sequence: 57..482

1 11 21 31 41 51
| | | | | |
6GCTCTCACC CTCCTCTCCT GCAGCTCCAG CTTTGTGCTC TGCCTCTGAG GAGACCATGG 60
CCCAGTATCT GAGTACCCTG CTGCTCCTGC TGGCCACCCT AGCTGTGGCC CTGGCCTGGA 120
6CCCCAAG6A GGAG6ATAGG ATAATCCCGG GTGGCATCTA TAAOGCAGAC CTCAATGATG 180
AGTGGGTACA GCGTGCCCTT CACTTCGCCA TCAGCGAGTA TAACAAGGCC ACCAAAGATG 240
ACTACTACAG ACGTCCGCTG CGGGTACTAA GAGCCAGGCA ACAGACCGTT GGGGGGGTGA 300
ATTACrrCTT CGACGTAGAG GTGGGCCGCA CCATATGTAC CAAGTCCCAG CCCAACTTGG 360
ACACCTGTGC CTTCCATGAA CAGCCAGAAC TGCAGAAGAA ACAGTTGTGC TCTTTCGAGA 420
TCTACGAAGT TCCCTGGGAG AACAGAAGGT CCCTGGTGAA ATCCAGGTGT CAAGAATCCT 480
AGGGATCT6T GCCAGGCCAT TCGCACCAGC CACCACCCAC TCCCACCCCC TGTAGTGCTC 540
CCACOCCTGG ACTGGTGGCC OCCACCCTGC GGGAGGOCTC CCCATGTGCC TGOGCCAAGA 600
GACA6ACAGA GAAOGCTGCA GGAGTCCTTT GTIXjCTCAGC AGGGCGCTCT GCCCTCCCTC 660
CTTOCTTCTT GCTTCTAATA GCGCTGGTAC ATGGTACACA CCCCOCCACC TCCTGCAATT 720
AAACAGTAGC ATGGCC

Seq ID NO: 601 Protein sequence
Protein Accession #: NP_001889.1

1 11 21 31 41 51
| | | | | |
MAOYLSTLLL LLATLAVALA WSPKEEDHII PGGIYMADLN DEWVQHALHP AISBYNKATK 60
DDYYRRPLRV LRARQQTVGG VNYFFDVEVG RTICTKSQPN LDTCAPHEQP ELQKKQLCSF 120
EIYEVPWENR RSLVXSRCQE S



Seq ID NO: 602 DNA sequence
Nucleic Acid Accession #: NM_003976.2
Coding sequence: 299..961

1 11 21 31 41 51
| | | | | |
CTCTGAGCTT CTCTGAGCCT TGTTTGCTCA TCTGGAAAAA GGGGATTAAA CCATTTACCT 60
CATGGAGTTG TGAAAGAATA GCTGCAAAGC ACCtAACACA TAGTARGGTT OCCAGTGCAG 120
CTJUmcroC TGGGTTGAGT CrAGCTGIGT AGGCGCCTTG TTCCTCACCT G^GAAACTG 180
GG6TOGCAGG COGGTCCCCC ACAAAAGA.TA ACICATCTCT TAATTTGCAA GCTGCCTCAA 240
CACGAGGGTG GGGGAACAGC TCAAC3UITGG CTGATGGGCG CTOCTGGTCT TGATACAGAT 300
GGAACTTCGA CTTCC5AGGCC TCTCCACGCT GTCCCACTGC CCCTCGCCTA GGCGGCaGCC 360
TGCCCTGTGG CCCACCCTGG CCGCTCTGGC TCTGCTGAGC AGCGTOSCAG AGGCCTOCCT 420
GGGCTCOGCG CCCOGCaGCC CTGCCCCCCC OGAAGGCCCC CCGCCT6TCC TGGOGTCCCX: 480
OGCGGGCCAC CTGCOGGGGG GACGCACGGC COGCTGGTGC AGTGGAAGAG CCCGGCGGCC 540
GCOGCCGCAG CCTTCTCGGC COGCGCCCCC GCCGCXTTGCA CCCCCATCTG CTCTTCCCCG 600
OGGGGGCCGC GCGGCGOGGG CTGCGGGCCC GGGGAGCOGC GCTCGGGCAfi 06GG0G06CG 660
GGGCTGCCGC CTGCGCTCGC AGCTGGTGCC GGTGCGCGCG CTCGGCCTGG GCCACC3GCTC 720
OGAOGAGCTG GTGOnTTCC GCTTCTGCAG CGGCTCCTGC CGCCGOGOGC GCTCTCCACA 780
OGACCTCAGC CTGGCCAGCC TACTGGGOGC OGQGGCCCTG OGACOGCCOC CGGGCTCCCG 840
GCCOGTCAGC CAGCCCTGCT GCCGACCCAC GOGCTACGAA GCGGTCTCCT TCATGGACGT 900
C3VACAGCACC TG6AGAACCG TGGACOGCCT CTCOGCCACC GCCTGOGGCT GCCTCGGCTG 960
AGGGCTCGCT CCAGGGCTTT GCAGACTGGA CCCTTACOGG TGGCTCTTOC TGOCTGGGAC 1020
ccTccoGCAG agtcxx:acta gccagcggcc tcagccaggg acgaaggcct caaagctgag 1080
AGGCCCCTAC cggtcggtga tggatatcat ccccgaacag gtgaagggac aactgactag 1140
cagccccaga gccctcaccc tgcggatccc agcctaaaag acaccagaga cctcagctat 1200
ggagcocptc ggacccactt ctcacagact ctggcactgg ccaggcctcg aacctgggac 1260
CCCTCCrCIG atgaacacta cagtggctga ggcatcagcc cccgcccagg ccctgtaggg 1320
acagcatttc aaggacacat attgcagttg cttggttgaa agtgcctgtg ctggaactgg 1380
CCTGTACTCA CTCATGGGAG CTGGCCCC

Seq ID NO: 603 Protein sequence
Protein Accession #: NP_003967.1

1 11 21 31 41 51
| | | | | |
MEIiGLGGIiST LSHCPWPRRQ PAI/WPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 60
PAGHIiPGGRT ARWCSGHARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 120
RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDIiSLASLLG AGAUIPPPGS 180
RPVSQPCCRP TRYEAVSPMD VNSTWRTVDR LSATA0GCU3

Seq ID NO: 604 DNA sequence
Nucleic Acid Accession #: NM_057091.1
Coding sequence: 783..1445

1 11 21 31 41 51
| | | | | |
ACTGGCCGCT GAGAGAAGAA TGGG6TG6AG CAGAGAGCA6 CTGCTGCAGG GCAGACAGCC 60
GGACCCCCAA ATCTGCACGT ACCAGCAGTC AGOXCCCCA CGCAGGGACC GGCTTACCCC 120
TCGCTCCCCG CCCTCACTCA CTTTCTCCOG cxxrrcxsGccc GGCCTCCCAG CTCTCTACTT 180
CGCGTGTCTA CAAACTCAAC TCCCGGTTTC CGTGCCTCTC CACCX5CTCGA GTTCTCTACT 240
CTCCATATCC GAGGGGCCCC TCCCAGCATC TACCCCCCTC CCAACCTCGG GGGACCTAGC 300
CAAGCTAGGG GGGACTGGAT CCGACG6GTG GAGCAGCCAG GTGAGCCCCG AAAGGTGGGG 360
CGGGGCAGGG GCGCTCCCAG CCCCACCC0C3 GGATCTGGTG ACGCTGGGGC TGGAATTTGA 420
CACOGGACGG CTGCGGCGGC GGGCAGGAGG CTGCTGAGGG ATG6AGTTGG GCCOGGCCCC 480
CAGACAAGGC CCGGGGGCTC CGCCAGCAGC AGGTCCCTCG GGCCCCAGCC CTGGCTGGCA 540
CCCGGGCCTG GAGCCCCACA CCCGAGGGTG CAGACTGGCT GCCAAGGCCA CACTTTTGGC 600
TAAAAGAGGC ACTGCCAGGT GTACAGTCXTT GGGCATGCGC TGTTTGAGCT TOGGGGGAGA 660
GCCCAGCACT GGTCCCCGGA AAGGTGCCTA GAAGAACAAG GTGCAGGACC CCGTGCTGCC 720
TCAACAGGAG GGTGGGGGAA CAGCTCAACA ATGGCTGATG GGCGCTCCTG GTGTTGATAG 780
AGATGGAACT TGGACTTGGA GGCCTCTOCA cxxrrGTCCCA CTGCCCCTGG CCTAGGOGGC 840
AGOCTGCCCT GTGGCCCACC CTGGCCGCTC TCGCTCTGCT 6AGCAGC6TC GCAGAG6CCT 900
COCTGGGCTC CGCGCCCCGC AGCCCTGCCC CC06CGAAGG CCCCCCGCCT GTCCTGGOGT 960
CXXXOGCCGG CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC 1020
G6COGOCX3CC GCAGCCTTCT OGGCCCGCGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC 1080
CC060GG6GG CGGOQCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG 1140
OGCGGGGCTG COGCCTGCGC TOGCAGCTGG TGCCGGTGCG CGCGCTCGGC CTGGGCCACC 1200
GCTCCGACGA GCTGGTGCX3T TTOOGCTTCr GCA6CGGCTC CTGCOjCCGC GCGCGCTCTC 1260
CACACGACCT CAGCCTGGCC AGCCTACTGG GGGC0GGG6C CCTGCGACG6 CCCCOGGGCT 1320
CCOGGCCCGT CAGCCAGCCC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG 1380
ACGTCAACAG CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG 1440
GCTGAGGGCT CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG 1500
GGACCCTCXX: GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGAGGAAG GCCTCAAAGC 1560
T6A6AGG00C CTA0C0GT6G GTGATGGATA TCATCXXCGA ACAGGTGAAG GGACAACTGA 1620
CTAGCAGCCC CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG 1680
CTATGGAGCC CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCT6 1740
GGACCCCTCC TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT 1800
AGGGACAGCA TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA 1860
CTGGCCTGTA CTCACTCATG GGAGCTGGCC CC

Seq ID NO: 605 Protein sequence
Protein Accession #: NP_003967.1

1 11 21 31 41 51
| | | | | |
MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP RBGPPPVLAS 60
PAGKLPGGRT ARHCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 120
RGCRLRSQLV PVRALGLGHR SDELVRPRPC SGSCRRARSP HDLSLASLW5 AGALRPPPGS 180
RPVSQPCCRP TRyEAVSF^a> VNSTWRTVDR LSATACGCLG



Seq ID NO: 606 DNA sequence 

Nucleic Acid Accession »; NM_057160.1 






Coding sequence: 1..714

1 11 21 31 41 51

ATCCCCGGCC TGATCTCAGC CXXSAGGACAG OCCCTCCTTG AGGTCCTTCC TCCCCAAGCC 60 

CACCTGGGTC CCCTCTTTCT CCCTGAGGCT CCACTTGGTC TCTCOGCGCA GCCT GCCCTG 120 

TGGCCCACCC TGGCCGCTCT GGCTCTGCTG AGCAGCGTCG CAGAGGCCTC CCTGGGCTCC 180

GOSCCCCGCA GCCCTGCCCC COGCtSAAGCX: G0C0CQCCT6 TCCTGGOGTC CCCCGCOGGC 240 

CACX:TGCXX3G GGGGACGCAC GGCCOGCTGG T6CAGTOGAA GAGCOCGGOG CCCGCCGCOG 300 

CAGCCTTCTC GGCCCGCGCC CCCGCCGCCT GCACCCCCAT CTGCTCTTCC CCGCGGGGGC 360

CGCX3CGGCGC GGGCTGGGGG CCCGGGCAGC OGOGCTOQGG CAjGOGGGQGC GCGGGGCIGC 420 

OGCCTGCGCT CGCAGCTGGT GCCGGTGOGC GOCSCTCGGCX: TCGGCCACOG CTCCGAOGAG 480 

CTGGTGOGTT TCOariTC T G CAGCGGCTCC TGCCSCOGCG CGCXSCTCTCC ACACGACCTC 540 

AGCCTGGCCA GCCTACTGGG CGCCGGGGCC CTGCGACCGC CCCCGGGCTC CCGGCCCGTC 600 

AGCCAGCCCT GCTGCXGACC CACGCGCTAC GAAGOSGTCT CCTTCATGGA CGTCAACRGC 660 

ACCTGGAGAA CCGTC3GACCG CCTCTCCGCC ACOGCCTGOG GCTGCCTGGG CTGAGGGCTC 720 

GCTCCAGGGC TTTGCAGACT GGACCCTTAC CGGTGGCTCT TCCTGCCTGG GACCCTCCCG 780 

CAGAGTCCCA CTAGCCAGCG GCCTCAGCCA GGGAOGAAGG CCTCAAAGCT GAGAGGCCCC 840 

TACOGGTGGG TGATCGATAT CATCCCCGAA CAGGTGAAGG GACAACTGAC TMXAGCCCC 900 

AGAGCCCTCA CCCTGCGGAT CCCAGCCTAA AAGACACCAC AGACCTCAGC TATGGAGCCC 960 

TTCOSACCCA CTTCTCACRC ACTCTGGCAC TGGOC»GGCG TOGAACCTGG GACCCCTCCT 1020 

CTGATGAACA CTACAGTGGC TGAGGCATCA GOCOOCGCOC AGGCCCTGTA GGGACAGCAT 1080 

TTGAAGGACA CATATTGCAG TTGCTTQGTT GAAAGTCCCT GTGCTGGAAC TGGCXnXSTAC 1140 
TCACTCATGG GAGCTGGCCC C 



Seq ID NO: 607 Protein sequence 
Protein Accession #: NP_476501.1

1 11 21 31 41 51

| | | | | |

MPGLISARGQ PLLEVLPPQA HLGPLFLPEA PLGLSAQPAL WPTLAALALL SSVAEASLGS 60
APRSPAPREG PPPVLASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSALPRGG 120
RAARAGGPGS RARAAGARGC RLRSQLVPVR ALGLGHRSDE LVRFRFCSGS CRRARSPHDL 180
SLASLLGAGA LRPPPGSRPV SQPCCRPTRY EAVSFMDVNS TWRTVDRLSA TACGCLG



Seq ID NO: 608 DNA sequence 

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29..715

1 11 21 31 41 51

CTGATGGGCG CTCCTGGTGT TGATAGAGAT G6AACTTGGA CTTGGAGGCX: TCTCXaiOGCT 60 

GTCCCACTGC CCCTCGCCTA GGOGGCAGGC TCCACTTGGT CTCTCOSCGC AGCCTGOCCT 120 

GTGGCCCACC CTGGCCGCTC TCGCTCTGCT GAGCAGCGTC GCAGAGGCCT CCXTrGGGCTC 180 

CGCGCCCCGC AGCCCTGCXX: CCCGCGAAGG CCCGCCGCCT GTCCTGGCGT CCCCCGCOGG 240 

CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC GGCCGCCGCC 300 

GCAGCCTTCT CGGCCCGCGC CCCCGCOGCC TGCACCCCCA TCTGCTCTTC CCCGCGGGGG 360 

CCGCGCGGCG C6GGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG CGCG6GGCTG 420 

CCGCCTCCGC TCGCAGCTGG TGCCGGTGOG CGCGCTCGGC CTGGGCCACC GCTCCGAOGA 480 

GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCC6C GOGOGCTCTC CACAOCSACCr 540 

CAGCCTGGCC AGCCTACTGG GCGCOGGGGC CCTGCGACOG CCCOOGGGCT CCCGGCCOCT 600 

CAGCCAGCCC TGCTGCOGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG ACGTCAACAG 660 

CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG GCTGAGGGCT 720 

OGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG GGACCCTCCC 780 

GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAG GCCTCAAAGC TGAGAGGCCC 840 

CTACCGGTGG GTGATGGATA TCATCCXXX5A ACAGGTGAAG GGACAACTGA CTAGCACCCC 900 

CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG CTATGGAGCC 960 

CTTCGGACCC ACTTCTCACA 6ACTCTGGCA CTGGCCAGGC CTCGAACCTG GGACCCCTCC 1020 

TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCOGCC CAGGCCCTGT AGGGACAGCA 1080 

TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGOAA CTGGCCTGTA 1140 
CTCACTCATG GGAGCTGGCC CC 



Seq ID NO: 609 Protein sequence 
Protein Accession #: NP_476431.1



1 11 21 31 41 51

MELGLGGLST LSHCPWPRRQ APLGLSAQPA LWPTLAALAL LSSVAEASLG SAPRSPAPRE 60

GPPPVLASPA GHLPGGRTAR WCSGRARRPP PQPSRPAPPP PAPPSALPRG GRAARAGGPG 120

SRARAAGARG CRLRSQLVPV RALGLGHRSD ELVRFRFCSG SCRRARSPHD LSLASLLGAG 180
ALRPPPGSRP VSQPCCRPTR YEAVSFMDVN STWRTVDRLS ATACGCLG



Seq ID NO: 610 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1746



1 11 21 31 41 51

| | | | | |

ATGCCACTGA AGCATTATCT CCTTTTGCTG GTGGGCTGCC AAGCCTGGGG TGCAGGGTTG 60
GCCTACCATG GCTGCCCTAG CGACTGTACC TGCTCCAGGG CCTCCCAGGT GGAGTGCACC 120
GGGGGACGCA TTGTGGOGGT GCCCACCCCT CTGCCCTGGA AOGCCATGAG CCTGCAGATC 180
CTCAACACGC ACATCACTGA ACTCAATGAG TCCCOGTTCC TCAATATCTC AGCCCTCATC 240
GCCCTGAGGA TTGAGAAGAA TGAGCTGTOG CGCATCACGC CTGGGGCCTT CCGAAACCTG 300
GGCTCGCTGC GCTATCTCAG CCTCGCCAAC AACAAGCTGC AGGTTCTGCC CATCGGCCTC 360
TTCCAGGGCC TQGACAGCCT TGAGTCTCTC CTTCTGTOCA GTAACCAGCT GTTGCAGATC 420
CAGCGGGCCC ACTTCTCCCA GTGCAGCAAC CTCAAGGAGC TGCAGTTGCA OGGCAACCAC 480
CTGGAATACA TOCCXGAOGG AGCCTTCGAC CACCTGGTAG GACTCACGAA GCTCAATCTG 540
GGCAAGAATA GCCTCACCCA CATCKaCCC AGQGTCTTCC AGCAOCTGGG CAATCTCCAG 600
CTCCTCOGGC TGTATGAGAA CAGGCTCAOG GATATCCCCA TGGGCACTTT T6ATGGGCTT 660
GTTAACCTGC AGGAACTGGC TCTACAGCAG AACCAGATTG GACTGCTCrC COfitSGTCTC 720
rrcCACAACA ACCACAACCr CCAGAGACTC TACCTGTCCA ACAACCACAT CTCCCAGCTG 780
CCACCCAGCA TCPTCATGC3V GCTGCCCCAG CrCAACCGTC TTACTCTCTT TGGGAATTCC 840
CTGAAGGAGC TCTCTCTGGG GATCTTCGGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900
TATGACAACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCOG CCAGTTGCAG 960
GTCCTGATTC TTAGCOGCAA TCAGATCAGC TTCATCTCCC CUGbTGOriT CAACGGGCTA 1020
ACGGAGCTTC GGQAGCTGTC CCTCCACACC AACGCACTGC AGGACCTGGA CGGGAATGTC 1080
TTCOGCATGT TGGCCAACCT GCAGAACATC TCCCTGCAGA ACAATOGCCT CAGACAGCTC 1140
CCAGGGAATA TCTTOGCCAA CGTCAATGGC CrCATGGCCA TCCAGCTGCA GAACAACCAG 1200
CTGGAGAACT TGCCCCTCCSG CATCTTOGAT CACCTGGGGA AACTGTGTGA GCTGOGGCTG 1260
TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCOGC TCOGCAACTG GCTCCTGCTC 1320
AACCAGCCTA GGTTAGG6AC GGACACTGTA CCiiriG'f G TT TCAGCCCAGC CAATGTCCGA 1380
GGCCAGTCCC TCATTATCAT CAATGTCAAC GTTGCTGTTC CAAGCGTCCA TGTCCCTGAG 1440
GTGCCTAGTT ACCCAGAAAC ACCATGGTAC CCAGACACAC CCAGTTACCC TGACACCACA 1500
TC06TCTCTT CTACCACTGA GCT AACCAGC CCTGTGGAAG ACTACACTGA TCTGACTACC 1560
ATTCAGGTCA CTGATGACGG CAGCGTTTGG GGCATGACCC AGGCCCAGAG OGGGCTGGCC 1620
ATTCCCGCCA TTGTAATTGG CATTGTCGCC CIGGCCTGCT CCCTGGCTGC CTGCGTCGGC 1680
TGTTGCTGCT GCAAGAAGAG GAGCCAAGCT GTCCTGATGC AGATGAAGGC ACCCAATGAG 1740
TGTTAAAGAG GCAGGCTGGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGGAAT T 1800
TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC CGTGATTGCT CTTTCTGGCC 1860
CTAGATAAAG GTGTGCCTAC CTCTTCCTGA CTTGCCTGAT TCTCCOGTAG AGAAGCAGGT 1920
GGTGCOGGAC CTTCCTACAA TCAGGAAGAT AGATCCAACT GGCCATGGCA AAAGCCCTGG 1980
GGATTTCCGA TTCATACCCC TGGGCTTCCT TCGAGAGGGC TCTTCCTCCA AATCCTCCCC 2040
ACCTGTCCTC CAAGAACAGC CTTCCCTGOG CCCAGGCCCC CTCOGGGCCT CTGTAGACTC 2100
AGTTAGTCCA CAGCCTGCTC ACTTCGTGGG AATAGTTCTC CGCTGAGATA GCCCCT CTOG 2160
CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATGGCTTG 2220
ACCCAGCATG TCCCCTCAAA TGAAAGTTCT CCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280
TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCOGTCAAT 2340
CAGCCTGGTT TTGGGGATGC TATGAAAGAG AGAAGGAAAA TCATGCCGCT CAGTTCCTGG 2400
AGACAGAAGA GCOGTCATCA G TGTCTC ACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460
CCCCAGCACA GCAAGCTCAG CCTTTTAGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520
TGAAAAGTTT AGCCCTTTAA GGAATGAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580
AAAATCAGCr TATTAATACX3 GGATAGAGAA AGAAATCTGG TGCCTGGGGG TCCCTGTGTT 2640
CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGTGTAC STGCAGAAAA 2700
GTGGGAACAT GATAGTGTAT GGCTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760
CAGCATCTAG ACXCAGACX:C AGAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820
GGAGATGGGG GCTTCTGAAG ATGGACTTAC CTGGGACCTG COCCCCATGA GCCAGGACGG 2880
TCCCCCCACA GTCAGCCTGT GCAAAGGCCC 0GTGGCCAG6 GGTGGAGGAG AATATGTGGG 2940
TGTGGACAGG ATGGGAGACT GTGGCCTGAA CAGGAGATTT TATTATATCT GGAGACCCTG 3000
AGAGACCCTG AGACCTGGGG CACCATGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAG 3060
GTCCXSTGCAG CCACACCCTC TTCCCTGCCA GCAAGTTGTC TGCGGCTCAT CGGAGGCCCC 3120
TCC6CCTGGA GOCTTCTATG GACGTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180
ACTTAGGGGA AGT6AAATCG CTCAGAGATG AGATCCTTTA ATTGAAAACG AAGTGTAACG 3240
GAATCTAGTG TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300
TGAACTTCAG AATCTCACTT ACM7CAGGG6 ACACGGG66T ACACCGATGG GTCAC ACTG G 3360
GTCTGGGGGC TCCCTGGAGC TCCTCCTGCG TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420
TCCAGGGTTA TTCTCCTCXTr CGAGTCACAG TCACACGAAT ACCTGCCTTC TCTGGC TTTC 3480
CTGCTATACA CATATTCACA TGGCGCTCAA GAAGTTAGGC TCATGGCAAC GTGTGTCTTT 3540
CTCTGGACAA CTGGCCCAGT TTACAGTGAA ATGGAGAATT TCAGGTCTCC ACGTCTGCCC 3600
AGGAAAGAAC TTCAGCTGAC TCCAC6GGGA TCTGGAAATC CAOGACCAAT CCCGATCGGC 3660
TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCTTTGG AAATCCACCA CCAATCCOGA 3720
TCGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCACCAAT 3780
CCCGATCGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAGGGCCA 3840
CAGGAGCACG TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAAGTG TCGAGAGGGC 3900
TGTTTGCAAA CACTAGTGCA CTTTGTAGCT TTTCACCCTC TGTCCCAGGG AATCTAGGAG 3960
AGATGAGGCC CGTCAGAfiTC AAGAGATGTC ATCCCCCCAG GGTCTCCAAG GCATTTCCAC 4020
ACTATTGGTG GCAOCTGQAG GACATGCACC AAGGCTTGCC AGAGCCAACA GGAAGTGAGC 4080
CCAGAGCATG GCACATGAGC ATCACCCGCT GATGGTGGCC TGCTGTGCCT GGTGCCAACA 4140
GGGGCATCCC GGCCCGTACC CCTCCAGACA GGAAGCATGG GTTTGCCCAC AGACCTGTOG 4200
GGTGCTCCTG TGAGTGGCCT CCAGATGTCT TTGTGCATAG GCACAAGTGG GCCAGGGCTG 4260
GAGGGAGGTG GGAAACCTCA TCATCCGGTG GGCCCTGCCA ATCTTAACCC AGAACCCTTA 4320
GGTATTCCTG GCAGTAGCCA TGACATTGGA GCACCTTCCT CTCCAGCCAG AGGCTGACCT 4380
GAGGGCCACT GTCCTCAGAT GACACCACXX: AGGAGCACCC TAGGTGAGGG GTGAGGGCCC 4440
CCTTATGTGA ACCTCTTGCC TCTTCCTTTC TCCCATCAGA GTOGTTGGAT GGAGCCATTG 4500
GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAGGAGC 4560
TACTAGAAAA GCTGAGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAAT TCTCA 4620
GGGCTGGAAT GAGCCGGCTG GTCCCCCAGA AAGCTGGAGT GGGGTACAGA GTTCAGTTTT 4680
CCTCTCTGTT TACAGCTCCT TGACAGTCCC ACGCCCATCT GGAGTGGGAG CTGGGAGTTA 4740
GTGTTGGAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTG 4800
TGCACAGATA CTCTTCAAGC ACTGGAOGTG GATTCTCTCT CTAGCCCTCA GCACCCCTGC 4860
GGTAGGAGTG OCGCCTCTAC CCACTT6TGA TGGGGTACAG AGGCACTTGC TCTTCTGCAT 4920
GGTGTTCAAT AGGCTGGGAG TTTTATTTAT CTCTTCAAAC TTTGTACAA6 AGCTCAT06C 4980
TTGTCTTGGG CTTTOGTCAT TAAACCAAAG GAAATGGAAG CCATTCCCCT GTTQCTCTCC 5040
TTAGTCTTGG TCATCAGAAC CTCACTTGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 5100
GGAAAAAATA AACTCTTCCA TCCCTTAAAG AATAGAATAG TTTGTCCCTC TCATGGGAAT 5160
TGGGCTGTAT GTATATTGTT CTTCCTCCTT AGAATTTAGA GATACAAGAG TTCTACTTAG 5220
AACTTTTCAT GGACACAATT TCCACAACCT TTCAGATGCT GATGTAGAGC TATTGGGAAA 5280
GAACTTCCAA ACTCAGGAAG TTTGCAGAGA GCAGACAGCT AGA6ATAACT CGGGACCCAG 5340
AGTTGGTCGA CAGATGTTAG ATGTATCCTA GCTTTTAGCC ATAAACCACP CAAAGATTCA 5400
GCCCCCAGAT CCCACAGTCA GAACTGAATC TGCGTTGTTG GGAAGCCAGC AGIGGCCTTG 5460
GGAAGGAAGC CATGGCTGTG GTTCAGAGAG GGTGGGCTGG CAAGCCACTT CCGGGGAAAA 5520
CTCCTTCCGC CCCAGGTTTC TTCTTCTCTT AAGGA6AGAT TGTTCTCACC AACCCGCTGC 5580
CTTCATGCTG CCTTCAAAGC TAGATCATGT TTGCCTTGCT TAGAGAATTA CTGCAAATCA 5640
GCCCCAGTGC TTGGGGATGC AtTTACAGAT TTCTAGGCCC TCAGGGrTTT GTAGAGTGTG 5700
AGCCCTGGTG GGCAGGGTTG GGGGGTCTGT CTTCXCCTGG ATGCTGCTTG TAATCCATTT 5760
GGTOTACAGA ATCAACAATA AATAATATAC ATGTAT

Seq ID NO: 611 Protein sequence
Protein Accession #: BAB84587.1

1 11 21 31 41 51 

MPLKHYLLLL VGCQAWGAGL AYHGCPSECT CSRASQVECT GARIVAVPTP LPWNAMSLQI 60

LNTHITELNE SPFLNISALI ALRIEKNELS RITPGAFRNL GSLRYLSLAN NKLQVLPIGL 120

FQGLDSLESL LLSSNQLLQI QPAHFSQCSN LKELQLHGNH LEYIPDGAFD HLVGLTKLNL 180

GKNSLTHISP RVFQHLGNLQ VLRLYENRLT DIPMGTFDGL VNLQELALQQ NQIGLLSPGL 240

FHNNHNLQRL YLSNNHISQL PPSIFMQLPQ LNRLTLFGNS LKELSLGIFG PMPNLRELWL 300

YDNHISSLPD NVFSNLRQLQ VLILSRNQIS FISPGAFNGL TELRELSLHT NALQDLDGNV 360

FRMLANLQNI SLQNNRLRQL PGNIFANVNG LMAIQLQNNQ LENLPLGIFD HLGKLCELRL 420

YDNPWRCDSD ILPLRNWLLL NQPRLGTDTV PVCFSPANVR GQSLIIINVN VAVPSVHVPE 480

VPSYPETPWY PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVTDDRSVW GMTQAQSGLA 540
IAAIVIGIVA LACSLAACVG CCCCKKRSQA VLMQMKAPNE C



Seq ID NO: 612 DNA sequence 
Nucleic Acid Accession #: XM_098151
Coding sequence: 1..447 

1 11 21 31 41 51 

ATGATGCATT TGCTCAATTC TCAGGGCTGG AATGAGCCGG CTGGTCCCCC AGAAAGCTGG 60

AGTGGGGTAC AGAGTTCAGT TTTCCTCTCT GTTTACAGCT CCTTGACAGT CCCACGCCCA 120

TCTGGAGTGG GAGCTGGGAG TCAGTGTTGG AGAAGAAACA ACAAAAGCCA ATTAGAACCA 180

CTATTTTTAA AAAGTGCTTA CTGTGCACAG ATACTCTTCA AGCACTGGAC GTGGATTCTC 240

TCTCTAGCCC TCAGCACCCC TGCGGTAGGA GTGCCGCCTC TACCCACTTG TGATGGGGTA 300

CAGAGGCACT TGCTCTTCTG CATGGTGTTC AATAGGCTGG GAGTTTTATT TATCTCTTCA 360

AACTTTGTAC AAGAGCTCAT GGCTTGTCTT GGGCTTTCGT CATTAAACCA AAGGAAATGG 420
AAGCCATTCC CCTGTTGCTC TCCTTAG



Seq ID NO: 613 Protein sequence 
Protein Accession #: XP_098151

1 11 21 31 41 51 

MMHLLNSQGW NEPAGPPESW SGVQSSVFLS VYSSLTVPRP SGVGAGSQCW RRNNKSQLEP 60

LFLKSAYCAQ ILFKHWTWIL SLALSTPAVG VPPLPTCDGV QRHLLFCMVF NRLGVLFISS 120

NFVQELMACL GLSSLNQRKW KPFPCCSP

Seq ID NO: 614 DNA sequence 

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77..1372

1 11 21 31 41 51

GTCCCCGCAG CGCCGTCGCG CCCTCCTGCC GCAGGCCACC GAGGCCGCCG CCGTCTAGCG 60 

CCCOGACCTC GCCACCATGA GAGCCCTGCT GGCGOGCCTG CTTCTCTGOG TCCTGGTCGT 120 

GAGCGACTCC AAAGGCAGCA ATGAACTTCA TCAAGTTCCA TCGAACTGTG ACTGTCTAAA 180 

TGGAGGAACA TCTGTGTCCA ACAAGTACTT CTCCAACATT CACTGGTGCA ACTGCCCAAA 240 

GAAATTOGGA GG6CAGCACT GTGAAATAGA TAAQTCAAAA ACCTGCTATG AGGGGAATGG 300 

TCACTTTTAC CGAGGAAAGG CCAGCACTGA CACCATGGGC CGGCCCTGCC TGCCCTGGAA 360 

CTCTGCCACT GTCCTTCAGC AAACGTACCA TGCCCACAGA TCTGATGCTC TTCAGCTGGG 420 

CCTGGGGAAA CATAATTACT GCAGGAACCC AGACAACCGG AGGCXSACCCT GGTGCTATGT 480 

GCAGGTGGGC CTAAAGCCGC TTGTCCAAGA GTGCATGGTG CATGACTGCX5 CAGATGGAAA 540 

AAAGCCCTCC TCTCCTCCAG AAGAATTAAA ATTTCAGTGT GGCCAAAAGA CTCTGAGGCC 600 

CCGCTTTAAG ATTATTGGGG GAGAATTCAC CACCATCGAG AACCA6CCCT GGTTTGCGGC 660 

CATCTACAGG AGGCACOGGG GGGGCTCTGT CACCTACGTG TGXGGAGGCA GOCTCATCAG 720 

CCCTTCCTGG GTGATCAGCG CCACAGACTG CTTCATTGAT TACCCAAAGA AGGAGGACTA 780 

CATCGTCTAC CTGGGTOGCT CAAGGCTTAA CTCCAACACG CAAGGGGAGA TGAAGTTTGA 840 

GGTGGAAAAC CTCATCCTAC ACAAGGACTA CAGOGCTGAC ACGCTTGCTC ACCACAACGA 900 

CATTCCCTTC CTGAAGATCC GTTCCAAGGA GGGCAGGTGT GOGCAGCCAT CCOGGACTAT 960 

ACAQACCATC TGCCTGCCCT CGATGTATAA CGATCCCCAG TTTGGCACAA GCTGTGAGAT 1020 

CACTGGCTTT G6AAAAGAGA ATTCTAOCGA CTATCTCTAT CCGGAGCAGC TGAAAATGAC 1080 

TGTTGTCAAG CTGATTTCCC ACCX3GGAGTG TCAGCAGCCC CACTACTACG GCTCTGAAGT 1140 

CACCACCAAA ATGCTATGTG CTGCTGACCC CC3UVTGGAAA ACAGATTCCT G0CAGG6AGA 1200 

CrCAGGGGGA CCCCTCGTCT GTTCCCTCC3V AGGCCX3CATG ACTTTGACTG GAATTGTGAG 1260 

CTGGGGCOGT GGATGTGCCC TGAAGGACAA GCCAGGCGTC TACAOGAGAG TCTCACACTT 1320 

CTTACCCTGG ATCCGCAGTC ACACCAAGGA AGAGAATGGC CTGGCCCTCT GAGGGTCCCC 1380 

AGGGAGGAAA OGGGCACCAC CCGCTTTCTT GCTGGTTGTC ATTTTTGCAG TAGAGTCATC 1440 

TCCATCAGCT GTAAGAAGAG ACTGGGAAGA TAGGCTCTGC ACAGATGGAT TTGCCTGTGG 1500 

CACCACCAGG GTGAACGACA ATAGCTTTAC CCTCACGGAT AGGCCTGGOT GCIGGCTGCC 1560 

CAGAOCCTCT GGCCAGGATG GAGGGGTGGT CCTGACTCAA CATGTTACTG ACCAGCAACT 1620 

TGTCTTTTTC TGGACTGAAG CCTGCAGGAG TTAAAAAGGG CAGGGCATCT CCTGTGCATG 1680 

GGCTCGAAGG GAGAGCCAGC TCCCCCGACC GGTGGGCATT TGTGAGGCCC ATGGTTGAGA 1740 

AATGAATAAT TTCCCAATTA GGAAGTGTAA GCAGCTGAGG TCTCTTGAGG GAGCTTAGCC 1800 

AATGTCGGA6 CftGCGGTTTG GGGAGCACAG ACACTAAOGA CTTCAGGGCA GGGCTCTGAT 1860 

ATTCCATGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT TGTTGTGTGG 1920 

GCTGTGAGTG TAAGTGTGAG TAAGAGCTGG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980 

AAACTGTGTG GACTGTGATG CCACACAGAG TGGTCTTTCT GGAGAGGTTA TAGGTCACTC 2040 

CTGGGGCCTC TTGGGTCCCC CACGTGACAG TGCCTGGGAA TGTACTTATT CTGCAGCATG 2100 

ACCTGTGACC AGCACTGTCT CAGTTTCACT TTCACATAGA TGTCCCTTTC TTGGCCAGTT 2160 

ATCCCTTCCT TTTAGCCTAG TTCATCCAAT CCTCACTGGG TGGGGTGAGG ACCACTCCTT 2220 

ACACTGAATA TTTATATTTC ACTATTTTTA TTTATATTTT TGTAATTTTA AATAAAAGTG 2280 
ATCAATAAAA TGT6ATTTTT CTGA 
Seq ID NO: 615 Protein sequence
Protein Accession #: NP_002649.1

1 11 21 31 41 51

MRALLARLLL CVLVVSDSKG SNELHQVPSN CDCLNGGTCV SNKYFSNIHW CNCPKKFGGQ 60
HCEIDKSKTC YEGNGHFYRG KASTDTMGRP CLPWNSATVL QQTYHAHRSD ALQLGLGKHN 120
YCRNPDNRRR PWCYVQVGLK PLVQECMVHD CADGKKPSSP PEELKFQCGQ KTLRPRFKII 180
GGEFTTIENQ PWFAAIYRRH RGGSVTYVCG GSLISPCWVI SATHCFIDYP KKEDYIVYLG 240
RSRLNSNTQG EMKFEVENLI LHKDYSADTL AHHNDIALLK IRSKEGRCAQ PSRTIQTICL 300
PSMYNDPQFG TSCEITGFGK ENSTDYLYPE QLKMTVVKLI SHRECQQPHY YGSEVTTKML 360
CAADPQWKTD SCQGDSGGPL VCSLQGRMTL TGIVSWGRGC ALKDKPGVYT RVSHFLPWIR 420
SHTKEENGLA L

Seq ID NO: 616 DNA sequence 

Nucleic Acid Accession #: NM_024422.1

Coding sequence: 202..2907

1 11 21 31 41 51

inrv-AAAfSfa AAAGCCCCtT OOKtOUBMO CAGGOGCTTC AGAGAAGCTA AGAAAAGCAC 60 

nr^rrnnra- CGGCCCTCGC CCCGCGGAGC CXTTCCTACCC CGGCCOGAOG CTC3GGCCCGC 180 

^SS^S StggaggS 2J0 

ctSScGGC T^CCTGCT GACCCTCGCG ATCTTAATAT TTGCCAGtGA TOCCTGCAAA 300 

ScATCTTCC CTCCAAACTA-GAT6CCGAGA AACTTGrrGG TAGAGTTAAC 360 

GCTTTACWSC TGCAAATCTA ATTCATTCAA GTGATCCTGA CTTCCAAATT 420 

GTtSgTCTA TACAACAAAT ACTATTCTAT TGTCCTOGGA GAAG^T 480 

S?^?AT TACmCCAA CACTGAGAAC CAAGAAAAGA AGAAAATATT TCTCTTTTTG 540 

SIS^^ JScCr AAAGAAAAGA CATACTAAAG AAAAA^ 600 

AAGAGAAGAT GGGCTCCAAT TCCTTCTTCG ATGCTAGAAA ACTCCTTGGG 7CCTTTTCCA 660 

?????^C ScStTCA ATCTGACACG GCCCAAAACT ATACCATATA CTATTCCATA 720 

AGAGGTCCTG GAGTTGACCA AGAACCTC3GG AATTTATTTT ATGTGGAGAG AGACACTGGA 780 

^™ TCTaStCGT GA^ 340 

^^SScAA CTCCaStGG GTATACTCCA GAACTTCCAC TGCCCCTAAT AATCAAAATA 900 

S^^S^ SgATAACTA CCCAATTTTT ACAGAAGAAA CTTJ^ 960 

GAAAATTCCA GAGIGGGCAC TACTCTCGGA CAAGTGTGTG CTACTCACAA AGATGAGCCT 1020 

S^CGA^ A^ScGCCT GAAGTACTCC ATCATTGGGC AGGTGCCACC ATCACCCACC 1080 

TGcSSSc TACAGGCGTG ATCAC^ 11^0 

SSSIISS aSaCT^ GTTGAAAAT^ ^200 

™TATCATT AACATTGATG A«^ 1260 

ACATTTACTC GTACTTCTTA TGTGACATCA GTGGAAGAAA ATACAGTTGA TGTCGAAATC 1320 

^SSIS™ CTGTTGAGGA TAAGGACTTA CnGAAT^^ ^380 

I^??iAA SSSaTGA AAATGGCAAT TTTAAAATTC T^^ "JO 
^^AGTTC TTTCTGTAGT TAAGCCTTTG A^^ 

SSS??^ TAotSaTGA AGCTCCATO ^^60 

AGCAWGCAA CAGTTACTGT TAATGTAGAA GATCAGGATG AGGGCCCTGA GTGTAACCCT 1620 

SSJSS^ ctgSSSt GAAAGAAAAT GCAGAAGT©^ 1680 

AA^StATC ACCCAGAAAC AAGAAGTAGC AGTGGCATAA GGTATW 1740 

GGGTCACCAT T^TCAAAAT AC^ "00 

GATA^^ C^ScXSVT CAAAAATGGC ATATATAATA TTACAGTCCT TGCATCAGAC 1860 

CAaSaG^ GAACATGTAC GGGGACACTG GGCATTATAC TTCAAGACGT GAATGATAAC 1920 

AGcS^?^ TACCTAAAAA GACAGTGATC ATCTGCAAAC CCACCATGTC ATCTGCGGAG 1980 

AGrTCTACTT CAGAAGTACA GAGAATGTGG AGACTGAAAG CAATTA^ 2100 

CGTCTTTCCT ATCAGAATGA TCCTCCATTT GGCTCATATG TAGTACCTAT AACA^S^ 2160 

GATAGACTTG GCATGTCTAG TGTCACTTCA TTGGATGTTA CACTGTGTGA CTGCATTACC 2220 

GAAAATGACT GCACACATCG TGTAGATCCA AGGATTGGCG GTGGAGGAGT ^C^CTT^ 2280 

AAOTGGGCCA TCCTTGCAAT ATTGTXGGGC ATAGCATTGC TCTTTTGCAT CCTGTTTACG 2340 

gSct?SgG GATO^ 2400 

CAGCAGAACC TAATTGTATC AAACACAGAA GCTCCTGGAG ATGACAAACT CTATWGCG 2460 

AATGGCTTCA CAACCCAAAC TGTGGGOGCT TCTGCTCAGG GAGTTTGTQG CACX3GTGGGA 2520 

TC^GAATCA AAAACGGAGG TCAGGAGACC ATCGAAATGG TGAAAGGAGG ACACCAGACC 2580 

S^^^ TGGCCACCAT CACACCCTGG ACrC^ 2640 

AO^GTGG ACAACTCCAG ATACACTTAC TCGGAGIGGC ACAGTTTTAC TCAGCCCCGT 2700 

SJSSgaAA AAGTGTATCT GTGTAAT^ "60 

GTCCTCACAT ATAACTATGA AGGAAGAGQA TCGGTGGCTG GCTC^ 2820 

GAACGACAAG AAGAAGATGG GCTTGAATTT TTGGATAATT TGGAGCCCAA ATTTAGGACA 2880 

SJ^^ StG^TGAA GAGATGAGTG TGTTCT^^^ 2940 

TTTATGACTT TTAAAAAAAA TTACAAACCA AGAATTTTTT AAAGCJ^ 3000 

TGGGGGTTTT TCrCTCATTA TTTGGATGGA ATCTCTT^ 3060 

' AGACACTATA AACAAGTACA CAAATTTTTC AATTTTTACA TATTTTTAAA TTACTTATCT 3120 

TctSSaAG GAGGTCTACA GAGAAATTAA AGTCTGO^ 3180 

ATGACAACAG CCAATTTATA GTGCAATAAA ATGTAATTAA TTCAAGTCCT TATTATAGAC 3240 

TATTTGAAGC ACAACCTAAT GGAAAATTGT AGAGACCTTG CTTTAACATT ATCTCCASTT 3300 

AATTAAGTGT TCATGTGGTG CTTGGAAACT GTTGTTTTCC TGAACATCTA AAGTGTGTAG 3360 

ACTGCATTCr TCCTATTATT TTATTCTTGT AATGTGACCT TTTCACTGTG CAAAGGGAGA 3420 
TTTCTAGCCA GGCATTGACT ATTACAATTT CATT 

Seq ID NO: 617 Protein sequence 
Protein Accession #: NP_077740.1

1 11 21 31 41 51

MEaAOPSGSW KGALCRLLLL TLAILIPASD ACKNVTmVP SKLDAEKLVG RVNLKECPTA 
"^^Z ?SJSgSVY TTNTILLSSE KRSFTILLSN TQiQEKKKIP VPLEHQTKVL 
KKBHTKEiCVL RHAKHRWAPI PCSMLBNSLG PPPLFUJQVQ
EPBMLFYVER DTCaJLYCTRP VDREQVESFS I IAFATT PDG 
PIFTEETYTF TIFSNCSVGT TVQQVCATDK PSPDTKaTRL 
TGVITTTSSQ LDRELIDICfQ LKIKVQOMDG QYFGMJTTST 
VTSVBENTVD VSILSVTVEO KDLVNTANWH ANYTILKGNE 
KPIJJYEEKQQ MILQIGWNE APFSREASPH SAMSTATVTV 
KENAEVGTTS KGYKAYDPET RSSSGIRYKK LTDPTGWVTI 
XNGIYNTTVL ASDQGGRTCT GTLGXZIiQDV NDHSPPIPKK 
EPIHGPPFDF SI*BSSTSEVQ RMfRLKAIND TAARLSYOJJD 
VTSIiDVTLCD CITESDCTHR VDPRIGGGGV QLGICMAILA.I 
TSKQPKVIPD DLAQQNLIVS NTEAPGDDKV YSANGFTTQT 
QBTIH-WKCG HQTSSSC8GA <2JHHTU3SCR GGHTEVDNC31 
dODENHKHA QDYVLTYNYE GRGSVAGSVG 
R 

Seq ID NO: 618 DNA sequence 
Nucleic Acid Accession #: NM_004949.1
Coding sequence: 202..2745






SDTAQNYTIY 
rrPEZ^LFLX 
KYSIIGQVPP 
CZIHZBDVKD 
KGNFKIVTDA 
HVEDQDEGPE 
DENTGSIKVF 
TV22CKPTNS 
PPPGSYWPI 
LLGIALLFCI 
VQASAQGVOG 
YTYSEHHSFT 
LEFLDNLEPK 



YSIRGPGVDO 

SPTIiFSKBFT 
HLPTPTRTSY 
KTNBGVLCW 
CNPPIQTVRM 
RSLDRSASTI 
SAEZVAVDPD 
TVRDRLGMSS 
LPTLVCGASG 
TVGSGIKNGG 
QPRLGHKVYL 
FRTIiAEACMK 



OGCCAAAGGA 
CTCTCCGOGC 
GCTCCGGCXX5 
GACCTGCCCC 
CTCTGCCGGC 
AATGTGACAT 
CTGAAAGAGT 
TTGGAGGATG 
TTTACCATAT 
GAGCATCAAA 
AAGAGAAGAT 
CTTTTCCTTC 
AGAGGTCCTG 
AACTTGTATT 
TTTGCAACAA 
GAGGATGAAA 
GAAAATTGCA 
GA CftOGA TGC 
CTATTTTCTA 
GAGTTAATTG 
GGTCTACAGA 
ACATTTACTC 
TTACGAGTTA 
ACCATTTTAA 
GAAGGAGTTC 
CAAATTGGTG 
AGCACAGCAA 
CCAATACAGA 
AAAGCATATG 
CCAACAGGGT 
GATAGAGAGG 
CAAG6AGGGA 
AGCCCATTCA 
ATTGTTGCGG 
AGTTCTACTT 
OGTCTTTCCT 
GATAGACTTG 
' GAAAATGACT 
AAGTGGGCCA 
CTGGTCTGTG 
CAGCAGAACC 
AATGGCTTCA 
TCAGGAATCA 
TCGGAATCCT 
ACGGAGGTGG 
CTTGGTGAAG 
GTATCTGTGT 
CTATGAAGGA 
AGATGGGCTT 
CATGAAGAGA 
AAAAAATTAC 
TCATTATTTG 
AGTACACAAA 
TCTACAGAGA 
TTTATAGTGC 
CCTAATGGAA 
GTGGTGCTTG 
ATTATTTTAT 
TTGACTATTA 



11 
I 

AAAGOCCCTT 
GCCCCACCTC 
OGGCCCTCGC 
GAGCCCTCTC 
TGCTCCTGCT 
TACATGTTCC 
GCTTTACA6C 
GTTCAGTCTA 
TACTTTCCAA 
CAAAGGTCCT 
GGGCTCCAAT 
AACAGGTTCA 
GAGTTQACCA 
GTACTOGTCC 
CTCCAGATGG 
ATGATAACTA 
GAGTGGGCAC 
ACAGACGCCT 
TGCATCX»AC 
ACAAGTACCA 
CAACTTCAAC 
GTACTTCTTA 
CTGTTGAGGA 
AGGGCAATGA 
TTTGTGTA6T 
TAGTTAATGA 
CAGTTACTGT 
CTGTTCGCAT 
ACCCftGAAAC 
GGGTCACCAT 
CA6AGACCAT 
GAACATCTAC 
TACCTAAAAA 
TTGATCCTGA 
CAGAAGTACA 
ATCAGAATGA 
GCATGTCTAG 
GCACACATCG 
TCCTTGCAAT 
GGGCTTCTGG 
TAATTGTATC 
CAACCCAAAC 
AAAACGGAGG 
GC06GGGGGC 
ACAACTGCAG 
AATCCATTAG 
AATCAAGATG 
AGAGGATCGG 
GAATTTTTGG 
TGAGTGTGTT 
AAACCAAGAA 
QATGGAATCr 
TTTTTCAATT 
AATTAAAGTC 
AATAAAATGT 
AATTGTAGA6 
GAAACTGTTG 
TCTTGTAATG 
CAATTTCATT 



21 
I 

G6ATGAGAGG 
CTCOGCCTCG 
CCCGOGGAGC 
CATGGAGGCA 
GACCCTGGGG 
CTCCAAACTA 
TGCAAATCTA 
TACAACAAAT 
CACTGAGAAC 
AAAGAAAAGA 
TCCTTGTTCG 
ATCTGACAGG 
AGAACCTOGG 
TGTAGATOGT 
GTATACTCCA 
CCCAATTTTT 
TACTGTGGGA 
GAAGTACTCC 
TACAGGCGTG 
GTTGAAAATA 
TTGTATCATT 
TGTGACATCA 
TAAGGACTTA 
AAATGGCAAT 
TAAGCCTTTG 
AGCTCCATTT 
TAATGTAGAA 
GAAAGAAAAT 
AAGAAGTAGC 
TGATGAAAAT 
CAAAAATGGC 
GGGGACACTG 
GACAGTGATC 
TGAGCCTATC 
GAGAATGTGG 
TCCTCCATTT 
TGTCACTTCA 
TGTAGATCCA 
ATTGTTGGGC 
GACCTCTAAA 
AAACACAGAA 
TGTGGGCGCT 
TCAGGAGACC 
TGGCCACCAT 
ATACACTTAC 
AGGACACACT 
AAAATCACAA 
TGGCTGGGTC 
ATAATTTGGA 
CTAATAAGTC 
TTTTTTAAAG 
CTTTGGTCAA 
TTTACATATT 
TGCCTTATTT 
AATTAATTCA 
ACCTTGCTTT 
TTTTCCTGAA 
TGACCTTTTC 



31 
I 

CAOGOGCTTC 
CGCTCCTCCT 
CCTCCTACCC 
GCCOGCCCCT 
ATCTTAATAT 
GATGCOGAGA 
ATTCATTCAA 
ACTATTCTAT 
CAAGAAAAGA 
CATACTAAAG 
ATGCTAGAAA 
GCCCAAAACT 
AATTTATTTT 
GAGCAGTATG 
GAACTTCCAC 
ACAGAAGAAA 
CAAGTGTGTG 
ATCATTGGGC 
ATCACCACAA 
AAAGTACAAG 
AACATTGATG 
GTGGAAGAAA 
GTX2AATACTG 
TTTAAAATTG 
AATTATGAAG 
TGCAGA6AG6 
GATCAGGAT6 
GCAGAAGTGG 
AGTGGCATAA 
ACAGGATCAA 
ATATATAATA 
GGCATTATAC 
ATCTGCAAAC 
CATG6CCCRC 
AGACTGAAAG 
GGCTCATATG 
TTGGATGTTA 
AGGATTGGCG 
ATAGCATTGC 
CAACCAAAAG 
GCTCCTGGAG 
TCTGCTCAGG 
ATCGAAATGG 
CACACCCTGG 
TOGGAGTGGC 
CIGATTAAAA 
GCATGCCCAA 
TGTAGGTTGT 
GCCCAAATTT 
TCTGAAAGCC 
CAGAAGATGC 
ATGCACATTT 
TTTAAATTAC 
GTTACATTTG 
AGTCCTTATT 
AACaVTTATCT 
CATCTAAAGT 
ACTGTQCAAA 



41 



I 

AGA6AASCTA 

GAGCAGOGGG 
CGGCCCGACG 
COGGCTCCTG 
TTGCCAGTGA 
AACTTGTTGG 
GTGATCCTGA 
TGTOCTOGGA 
AGAAAATATT 
AAAAAGTTCT 
ACTCCTTGGG 
ATACCATATA 
ATG TGGAGAG 
AATCTTTTGA 
TGCCCCTAAT 
CTTATACTTT 

ctactgacaa 
aggtgccacc 
catcatctca 
acatggatgg 
atgtaaatga 
atacagttga 
ctaactggag 
taacagatgc 
aaaagcaaca 
ctagtcCaag 
agogccctga 
gaacaacaag 

GGTATAAGAA 
TCAAAGTTTT 
TTACAGTCCT 
TTCAAGACGT 
CCACCATGTC 
CCTTTGACTT 
CAATTAATGA 
TAGTACCTAT 
CACTGTGTGA 
GTGGAGGAGT 
TCTTTTGCAT 
TAATTCCTGA 
ATGACAAAGT 
GAGTTTGTGG 
TGAAAGGAGG 
ACTCCTGCAG 
ACAGTTTTAC 
ATTAAACAAT 
GACTATGTCC 
TGCAGTGAAC 
AGGACACTAG 
AGTGGCTTTA 
TATTTGTGGG 
ACAGAGAGAC 
TTATCTTCTA 
GGTATAATGA 
ATAGACTATT 
CCAGTTAATT 
GTGTAGACTG 
GGGAGATTTC 



51 



180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



AGAAAAGCAC 
CCCAGACTGC 
CTCGGCCOSC 
GAACGGAGCC 
TGCCTGCAAA 
TAGAGTTAAC , 
CTTCCAAATT 
GAAGAGAAGT 
TGTCTTTTTG 
AAGGOSCGCC 
TCCTTTTCCA 
CTATTCCATA 
AGACACTGGA 
GATAATTGCC 
AATCAAAATA 
TACAATTTTT 
AGATGAGCCT 
ATCACCCACC 
GCTAGACAGA 
TCAGTATTTT 
CCACTTGCCA 
TGTGGAAATC 
AGCTAATTAT 
CAAAACCAAT 
GATGATCTTG 
ATCAGCCATG 
GTGTAACCCT 
CAATGGATAT 
ATTAACTGAT 
CAGAAGCCTG 
TGCATCAGAC 
GAATGATAAC 
ATCTG0G6AG 
TAGTCTG6AG 
TACAGCAGCA 
AACAGTGAGA 
CTGCATTACC 
ACAACTTGGA 
CCTGTTTACG 
TGATTTAGCC 
GTATTCTGCXa 
CACCGTGGGA 
ACACCAGACC 
GGGAGGACAC 
TCAGCCCCGT 
6AAAGAAAGT 
TGACATATAA 
QACAAGAAGA 
CAGAAGCATG 
TGACTTTTAA 
GGTTTTTCTC 
ACTATAAACA 
TCCAAGGAGG 
CAACAGCCAA 
TGAA6CACAA 
AAGT6TTCAT 
CATTCTTGCT 
TAGGCAGGCA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3460 



Seq ID NO: 619 Protein sequence 
Protein Accession #: NP_004940.1

1 11 21 31 41 51

MEAARPSGSW NGALCRLLLL TLAILIFASD AdOJVTLHVP SKLDAEKLVC aVMLKBCPTA 






GHTLIKN 

Seq ID NO: 620 DNA sequence

Nucleic Acid Accession #: NM_032545.1

Coding sequence: 46..718



1 11 21 31 41 51 

I I 1 1 I 1 




ANLIHSSDPD FQILBDGSVY TMriLLSSE KRSPTILLSH TESQEKKKIP VFLEHOTKVL 120 

SLcRHKAPI KSHI^SLG- PFPLFMJQVQ SDTAQSmY VSIRGP^ 180 

DTGKLYCTRP VtSBQYESFE IIAFATTPOG YTPKLPLPLI I«fD^ "0 

PIPTSETYTP TIFEJCRVGT TVGQVCATDK DBPOTKHTRL KySIIGQVPP SPTLFSM^ 300 

TCVITTTSSQ LDRELIDKYQ LKIKVQDJOXS QYFGIiQTTST CIINIOOVND ELPTPTRTSV 360 

VTSvSSvD rai*VTVED KDLVOTAHWR ANYTILKGNE NGKFKIVTDA KTNEGVLCW 420 

SS^^ ^LQIGWNE APFSREASPR SAMSTATVTV NVEDQDEGPE OJPPIQTVRM 480 

^^^^ NGYKAYDPET RSSSGIRYKK LTOPTGHVTI DENTGSIKVF RSLDRBAETI 540 

WDQ^TCr GTLGIIUJDV NDMSPPIPXK T7IICKPTMS SAEIVAVDPD 600 

^ihSpS^ slSsSevo rmwrlkaikd TAARLSYQND PPFGSYWPI TVRDRI^ 660 

CITESTXrTHR VDPRIGGGGV QLCKVCAILAI LIX3IALLFCI LFTLVCGA^ 720 

TSMPKViro DLAQQNLIVS MTEAPaJDKV YSANGFTTQT VGASAQGVCG TVGSGI^^ 780 

S^^S^ HQtSsCRGA GHHHTLDSCR GGHTEVDNCR YTYSEWHSPT QPRLGEESm 840 



60 
120 



1 11 21 31 41 51

iAACTGATCT icAATGCACT AAGAGAAGGA GACTCTCAAA CCAAAAATCA CCTGGAGGCA 
^ScicTTTA CGGTCAGTTT GGCATTACAG ATCATCAiOT ^GGAA^ 

CTATO^ GAGAAACATA ACGGCGGTAG AGAGGAAGTC ACCAAGGTTG CCACTCAGAA 180 

T^CTCA ACIGGACCTC CAGTCATTTC GGAGAGGTGA CTGGGAGCGC 240 

SS^S^ GGGCOGGACG AGCOGCTCCC CTACTCCCGG GCTTTOGGAG AGGGTGCGTC 300 

SSgOTGCCG CGCTGCTGCA GGAACGGOGG TACCTGOGTG CTGGGCAGCT TCTGOGTGTG 360 

CCTCGAGCAC GGAGCCTCGA CCCTCCGCGC CTGCCACCTC TGCAGGTGCA TCTTCGGGGC 480 

^CTGCACTGC CTCCCCCTCC AGACGCCTGA CCGCTCTGAC CCGAAACACT TCCTGGOTC 540 

CCaSct^ GGGCCGAGCG CCGGGGGCGC GCCCAGCCTG CTACTCTTGC TGCCCTGCGC 600 

SScCTGCAC CG^CTGC GCCCGGATGC GCCCGCGCAC CCTCGGTCCC TGGTCCCTTC 660 

SS^S^G O^GAgSc GCCCCroOGG AAGGOOGGGA CTTGGGCATC GCCTTTAATT 720 

SaATaSaG ATGTGTTTAG TTTACOGTAA GCTGAAGCAC TGQGTGAATA 780 

^OTtStGG GTAATAAATA TTTTCATGAA AGOGCCAAAA AAAAAAAAAA AAAAAAAAAA 840 
AAAAAA 

Seq ID NO: 621 Protein sequence 
Protein Accession #: NP_115934.1



60 
120 



MTWRHHVRLL FTVSLALQII NUSNSYQREK HNGGREEVTK VATQKHRQSP "WTS^E 
VrcaEGWGP EBPLPYSRAF GEGASARPRC CRNGGTCVLG SPCVCPAHFT GRY^HDQM 
SECGALEHGA WTLRACHLCR CIFGALHCLP LQTPDRCDPK DFLASHAHGP SAGQAPSLIi 180 
LLPCALLHRL LRPDAPAHPR SVJ9SVLQRE RRPCGRPGLG HRL 

Seq ID NO: 622 DNA sequence

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 1..390

1 11 21 31 41 51

Itgaggttca GTCTCTCAGG CATGAGGACC GACTACCCCA GGAGTGTGCT GGCTCCTGCT 60 

TATGTGTCAG TCTCTCTCCT CCTCTTGTGT CCAAGGGAAG TCATOSCTOC OGCTQGCTCA 120 

gaaccatcgc tctcccagcc ggcacccagg tctcgagaca AGATCTACAA COCCTTGGAG 180 

CAGTGCTCTT ACAATGACGC CATOGTGTCC CTGAGCGAGA CCCGCCAATG TGGTCCCCCC 240 

TCCACCrrCT GGCCCTCCTT TGAGCTCTGC TGTCTTGATT CCrTTGGCCr CACAAACGAT 300 

TTTGTTGTGA AGCTXSAAGGT TCAGGGTGTC AATTCCCAGT GCCACTCATC TCCCATCTCC 360 
AGTAAATGTG AAA6A6GCG6. 6ATATGTTAG 

Seq ID NO: 623 Protein sequence 
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

LpSVSGMRT DYPRSVLAPA YVSVCLLLLC PREVIAPAGS EPWLCQPAPR OGDKIYNPLE 60 
QCCYNDAIVS LSETRQGGPP CTFWPCPELC CLDSFGLTND FWKLKVQGV NSQCHSSPIS 120 

SKCERGRIC 

Seq ID NO: 624 DNA sequence 
Nucleic Acid Accession #: M18728.1
Coding sequence: 51..1085

1 11 21 31 41 51 

GGAGCTCAAG CTOCTCTACA AAGAGGTGGA CAGAGAAGAC AGCAGAGACC ATGGGACCCC 60 

CCTCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTOGAACCCA CCCACCACTG CCAAGCTCAC TATTGAATCC ACGCCATTCA 180 

ATGTCGCAGA QGGGAAGGAG GTTCTTCTAC TCGCCCACAA CCTGC CCCAG AATCGTATTG 240 

GTTACAGCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTGTA GGATATGTAA 300 

TAGGAACTCA ACAAGCTACC CCAGGGCCOG CATACAGTGG TCGAGACACA ATATACCCCA 360 

ATGCATCCCT GCTGATCCAG AACGTCACCC AGAATGACAC AGGATTCTAT ACCCTACAAG 420 

TCATAAAGTC AGATCTTGTG AATGAAGAAG CAACCGGACA GTTCCATGTA TACCCGGAGC 480 

TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT G6AGGACAAG GATGCTGTGG 540 

CCTTCACXrre TGAACCTGAG GTTCAGAACA CAACCTACCT grogT^IA AWGGTCAGA 600 

GCCTCCCGGT CAGTCCCAGG CTGCAGCTGT OCAATGGCAA CATOACCCTC ACTCTACTCA 660 



GCGTCAAAAG GAACGATGCA GGATCCTATG AATGTGAAAT ACAGAACCCA GCGAGTGCCA 720

ACCGCAGTGA COCAGTCACC CTGAATGTOC TCTATGGCCC AG ATGTCCC C ACCAmCCC 780 

CCTCAAAGGC CAATTAOajT GCAGGGGAAA ATCTGAACCT CTCCIGCCAC GCAGOCTCTA 840 

ACCCACCT6C ACAGTACTCT TGGTTTATCA ATGGQAOGTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GOGGATCCTA TATGTGCX:AA GCCCATAACT 960 

CAGCCACTGG CCTCAATAGG ACCACAGTCA CC3ATGATCAC AGTCTCTGGA AGTGCTCCTG 1020 

TCCTCTCACC TCTGGCCAOC GTC3GGCATCA OGATrGGAGT GCTGGOCAGG GTGGCTCTGA 1080 
TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTCCT GAA6CCCTAT ATGCTGGAGA TCGACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAGGC CTGAGGTGTG TGCCACTCAG AGACTTCACC TAACTAGAGA CAGTCAAACT 1320 

GCAAACCATG GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTPCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATCA TAAGGCTCTT ACCCCCTTTT AATTTCTCCT TGCTTATCCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTCT CATTAGTATT TCACAAGAAG TACCTTCAGA 1500

GGGTAACTTA ACAGAGTCTC AGATCTATCT TGTCAATCCC AACGTTTTAC ATAAAATAAG 1560 

AGATCCTTTA GTGCACCCAG TCACTGACSIT TAGCAGCATC TTTAACACAG OOGTOTGTTC 1620 

AAATGTACAG TGGTCCTTTT CAGAGTTGGA CTTCTAGACT CArcTOTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAGGAG^CT GTGCAGTTTC TGACACTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTGTAGAA ATTAACAAAT GTGCTGCTTG GTTAAAATGG CTACACTCAT 1860 

CTGACTCATT CTTTATTCTA TTTTA£5TTGG TTTGTATCTT GCCTAAGGTG CGTAGTCCAA 1920 

CrCTTGGTAT TAOCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980

CTCTAAAAGC TTTAAATGTC TCCATGCAGC CAGCCATCAA ATAGTGAATG CTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA OCCATGAAGG 2100 

ATAAAAGCCC CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160 

TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220 

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280 

ACACAGGAGA TTCCAGTCTA CTTCAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCIGA TCTTAACCAA TGTATTTATT TCTGTGGTTC 2400 

TGTTTCCTTG TTCCAATTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGGGAG 2460

CTATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520
TAGCTCTATA ACT



Seq ID NO: 625 Protein sequence
Protein Accession #: AAA59907.1

1 11 21 31 41 51 

MGPPSAPPCR LHVPWKEVLL TASLLTFWNP PTTAKLTIES TPFNVAEGKE VLLLAHNLPQ 60

NRIGYSWYKG ERVDGKSLIV GYVIGTQQAT PGPAYSGRET IYPNASLLIQ NVTQNDTGFY 120

TLQVIKSDLV NEEATCQFHV YPELPKPSIS SNNSNPVEDK DAVAFTCEPS VQNTTYLWWV 180

NGQSLPVSPR LQLSNGNMTL TLLSVKRNDA GSYECEIQNP ASANRSDPVT LNVLYGPDVP 240

TISPSKANYR PGENLNLSCH AASNPPAQYS WFINGTFQQS TQELFIPMIT VNNSGSYMCQ 300
AHNSATGLNR TTVTMITVSG SAPVLSAVAT VGITIGVLAR VALI



Seq ID NO: 626 DNA sequence 
Nucleic Acid Accession #: M18728.1
Coding sequence: 1355.. 1657

1 11 21 31 41 51 

GGAGCTCAAG CTCCTCTACA AAGAGGTGGA CAGAGAAGAC AGCAGAGACC ATGGGACCCC 60 

CCTCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCACCACTG CCAAGCTCAC TATTGAATCC ACGCCATTCA 180 

ATGTCGCAGA GGGGAAGGAG GTTCTTCTAC TCGCCCACAA CCTGCCCCAG AATCGTATTG 240

CTTACAGCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTGTA GGATATGTAA 300

TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGACA ATATACCCCA 360

ATGCATCCCT GCTGATCCAG AACGTCACCC AGAATGACAC AGGATTCTAT ACCCTACAAG 420

TCATAAAGTC AGATCTTGTG AATGAAGAAG CAACCX3GACA GTTCCATGTA TACXXX3GAGC 480 

TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT GGAGGACAAG GATGCTGTGG 540 

CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTGGGTA AATGGTCAGA 600 

GCCTCCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CATGACCCTC ACTCTACTCA 660

GCGTCAAAAG GAACGATGCA GGATCCTATG AATGTGAAAT ACAGAACCCA GCGAGTGCCA 720

ACCGCAGTGA CCCAGTCACC CTGAATGTCC TCTATGGCCC AGATGTCCCC ACCATTTCCC 780 

CCTCAAAGGC CAATTACCGT GCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA 840 

ACCCACCTGC ACAGTACTCT TGGTTTATCA ATGGGACGTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCCATAACT 960 

CAGCCACTGG CCTCAATAGG ACCACAGTCA CGATGATCAC AGTCTCTGGA AGTGCTCCTG 1020

TCCTCTCAGC TGTCGCCACC GTCGGCATCA CGATTGGAGT GCTGGCCAGG GTGGCTCTGA 1080 

TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTCCT GAAGCCCTAT ATGCTGGAGA TGGACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAGGC CTGAGGTGTC TGCCACTCAG AGACTTCACC TAACTAGAGA CAGTCAAACT 1320 

GCAAACCATG GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATCCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAG TACCTTCAGA 1500 

GGGTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AACGTTTTAC ATAAAATAAG 1560 

AGATCCTTTA GTGCACCCAG TGACTGACAT TAGCAGCATC TTTAACACAG CCGTGTGTTC 1620 

AAATGTACAG TGGTCCTTTT CAGAGTTGGA CTTCTAGACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAGGAGTCT GTGCAGTTTC TGACACTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTCTAGAA ATTAACAAAT GTGCTGCTTG GTTAAAATGG CTACACTCAT 1860

CTGACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAGGTG CGTAGTCCAA 1920 

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTACTCAT ACTCCCTGGT GTAGTGTATT 1980

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCCC CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160
TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340

TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 2400

TGTTTCCTTG TTCCAATTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGGGAG 2460

CTATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520
TAGCTCTATA ACT 



Seq ID NO: 627 Protein sequence
Protein Accession #: AAA59908.1



1 11 21 31 41 51 

| | | | | |

MDSFSQDVKT RLLIMIRLLP PFNLSLLMPA SPAHQDDAVI SISQEVASEG NLTECQIYLV

NPNVLHKIRD PLVHPVTDIS SIFNTAVCSN VQWSFSELDF 



Seq ID NO: 628 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 2370.. 2501 

1 11 21 31 41 51

GGAGCTCAAG CTCCTCTACA AAGAGGTGGA CAGAGAAGAC AGCAGAGACC ATGGGACCCC 60

CCTCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCACCACTG CCAAGCTCAC TATTGAATCC ACGCCATTCA 180 

ATGTCGCAGA GGGGAAGGAG GTTCTTCTAC TCGCCCACAA CCTGCCCCAG AATCGTATTG 240

GTTACAGCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTGTA GGATATGTAA 300 

TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGACA ATATACCCCA 360

ATGCATCCCT GCTGATCCAG AACGTCACCC AGAATGACAC AGGATTCTAT ACCCTACAAG 420

TCATAAAGTC AGATCTTGTG AATGAAGAAG CAACCGGACA GTTCCATGTA TACCCGGAGC 480

TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT GGAGGACAAG GATGCTGTGG 540 

CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTGGGTA AATGGTCAGA 600 

GCCTCCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CATGACCCTC ACTCTACTCA 660 

GCGTCAAAAG GAACGATGCA GGATCCTATG AATGTGAAAT ACAGAACCCA GCGAGTGCCA 720

ACCGCAGTGA CCCAGTCACC CTGAATGTCC TCTATGGCCC AGATGTCCCC ACCATTTCCC 780 

CCTCAAAGGC CAATTACCGT CCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA 840 

ACCCACCTGC ACAGTACTCT TGGTTTATCA ATGGGACGTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCCATAACT 960

CAGCCACTGG CCTCAATAGG ACCACAGTCA CGATGATCAC AGTCTCTGGA AGTCCTCCTC 1020

TCCTCTCAGC TGTGGCCACC GTCGGCATCA CGATTGGAGT GCTGGCCAGG GTGGCTCTGA 1080 

TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTCCT GAAGCCCTAT ATGCTGGAGA TGGACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAGGC CTGACGTGTG TGCCACTCAG AGACTTCACC TAACTAGAGA CACTCAAACT 1320

GCAAACCATG GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAG TAGCTTCAGA 1500

GGGTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AACGTTTTAC ATAAAATAAG 1560 

AGATCCTTTA GTGCACCCAG TGACTGACAT TAGCAGCATC TTTAACACAG CCGTGTGTTC 1620

AAATGTACAG TGGTCCTTTT CAGAGTTGGA CTTCTAGACT CACCTGTTCT CACTCCCTGT 1680

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAGGAGTCT GTGCAGTTTC TGACACTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTGTAGAA ATTAACAAAT GTGCTGCTTG GTTAAAATGG CTACACTCAT 1860 

CTGACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAGGTG CGTAGTCCAA 1920

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100

ATAAAAGCCC CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160 

TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220 

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280 

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 2400 

TGTTTCCTTG TTCCAATTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGGGAG 2460

CTATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520 
TAGCTCTATA ACT 



Seq ID NO: 629 Protein sequence
Protein Accession #: AAA59909.1

1 11 21 31 41 51

| | | | | |
MLTNVFISVV LFPCSNLTKP TVLVLYCPGG AITVLVEWCC FNS



Seq ID NO: 630 DNA sequence 

Nucleic Acid Accession #: NM_016639.1

Coding sequence: 40.. 429 

1 11 21 31 41 51

GCGGCGGGCG CaSACAGGGG CGGGOGCAGG ACGTGCACTA TGGCTCGGGG CTCGCTGCGC 60

CGGTTGCTGC GGCTCCTOGT GCTGGGGCTC TGGCTGGOGT TGCTGCGCTC CGTGGCOQGG 120 

GAGCAAGCGC CAGGCACCGC CCCCTGCTCC OGOGGCAGCT CCTGGAGOGC GGACCTGGAC 180 

AAGTGCATGG ACTGCGCGTC TTGCAGGGCG CGACCGCACA GCGACTTCTG CCTGGGCTGC 240

GCTGCAGCAC CTCCTGCCCC CTTCCGGCTG CTTTGGCCCA TCCTTGGGGG CGCTCTGAGC 300 

CTGACCTTGG TGCTGGGGCT GCTTTCTGGC TTTTTGGTCT GGAGACGATG CCGCAGGAGA 360

GAGAAGTTCA CCACCCCCAT AGAGGAGACC GGCGGAGAGG GCTGCCCAGC TGTGGCGCTG 420
ATGCAGTGAC AATGTGCCCC CTGCCAGCOG GGGCTOGCOC ACTCATCATT CATTCATCCA 480
TTCTAGAGCC AGTCTCTGCC TCCCAGAOGC GGOGGGAGOC AAGCTOCTCC AACCACAAGG 540
GGGGTGGGGG GOGGTCAATC ACCTCEGAGG CCTGGGCCCA GGGTTCAGGG GAAOCTTCCA 600
AGCTGTCTCG TTGCCCTGCC TCTGGCTCCA GAACAGAAAG GGAfiOCTCAC GCTGGCTCAC 660
ACJVAAACAGC TGACACTGAC TAAGGAACTG CAGCATTTGC ACAGGOGAGG GGGGTGCCCT 720
CCTTCCTTAG GACCTGGGGG CCAGGCTGAC TTGGGGGGCA GACTTGACAC TAGGCCCCAC 780
TCACTCAGAT GTCCTGAAAT TCCACCACGG GGGTCACCCT GGGGGGTTAG GGACCTATTT 840
TTAACACTAG GGGCTGGCCC ACTAGGAGGG CTGGCCCTAA GATACAGACC COCCCAACTC 900
CCCAAAGCGG GGMjGAGATA TTTATTTTGG GGAGAGTTTG GAGGOGAGGG AGAATTTATT 960
AATAAAAGAA TCTTTAACTT TAAAAAAAAA AAAAAAAA

Seq ID NO: 631 Protein sequence
Protein Accession #: NP_057723.1

1 11 21 31 41 51
| | | | | |
MARGSLRRLL RLLVLGLWLA LLRSVAGEQA PGTAPCSRGS SWSADLDKCM DCASCRARPH 60
SDFCLGCAAA PPAPFRLLHP ILGGALSLTF VLGLLSGFLV WRRCRRREKP TTPIEETGGE 120
GCPAVALIQ

Seq ID NO: 632 DNA sequence

Nucleic Acid Accession #: NM_003816.1

Coding sequence: 79.. 2538

1 11 21 31 41 51
| | | | | |
CGGCAGGGTT GGAAAATGAT 0GAAGAGG0G GAGGTGGAGG CGACCGAGTG CTGAGAGGAA 60
CCTGOGGAAT CGGCCGAGAT GGGGTCTGGC GCGCGCTTTC CCTCGGGGAC CCTTCGTGTC 120
CGGTGGTTGC TGTTGCTTGG CCTGGTGGGC CCAGTCCTCG GTGCGGCGCG GCCAGGCTTT 180
CAACAGACCT CACATCTTTC TTCTTATGAA ATTATAACTC CTTGGAGATT AACTAGAGAA 240
AGAAGAGAAG CCCCTAGGCC CTATTCAAAA CAAGTATCTT ATGTTATTCA GGCTGAAGGA 300
AAAGAGCATA TTATTCACTT GGAAAGGAAC AAAGACCTTT TGCCTGAAGA TTTTGTGGTT 360
TATACTTACA ACAAGGAAGG GACTTTAATC ACTGACCATC CCAATATACA GAATCATTGT 420
CATTATCGGG GCTATGTGGA GGGAGTTCAT AATTCATCCA TTGCTCTTAG CGACTGTTTT 480
GGACTCAGAG GATTGCTGCA TTTAGAGAAT GCGAGTTATG GGATTGAACC CCTGCAGAAC 540
AGCTCTCATT TTGAGCACAT CATTTATCGA ATGGATGATG TCTACAAAGA GCCTCTGAAA 600
TGTGGAGTTT CCAACAAGGA TATAGAGAAA GAAACTGCAA AGGATGAAGA GGAAGAGCCT 660
GCCAGCATGA CTCAGCTACT TCGAAGAAGA AGAGCTGTCT TGCCACAGAC CCGGTATGTG 720
GAGCTGTTCA TTGTCGTAGA CAAGGAAAGG TATGACATGA TGGGAAGAAA TCAGACTGCT 780
GTGAGAGAAG AGATGATTCT CCTGGCAAAC TACTTGGATA GTATGTATAT TATGTTAAAT 840
ATTCGAATTG TGCTAGTTGG ACTGGAGATT TGGACCAATG GAAACCTGAT CAACATAGTT 900
GGGGGTGCTG GTGATGTGCT GGGGAACTTC GTGCAGTGGC GGGAAAAGTT TCTTATCACA 960
CGTCGGAGAC ATGACAGTGC ACAGCTAGTT CTAAAGAAAG GTTTTGGTGG AACTGCAGGA 1020
ATGGGATTTG TGGGAACAGT GTGTTCAAGG AGCCACGCAG GCGGGATTAA TGTGTTTGGA 1080
CAAATCACTG TGGAGACATT TGCTTCCATT GTTGCTCATG AATTGGGTCA TAATCTTGGA 1140
ATGAATCACG ATGATGGGAG AGATTGTTCC TGTGGAGCAA AGAGCTGCAT CATGAATTCA 1200
GGAGCATCGG GTTCCAGAAA CTTTAGCAGT TGCAGTGCAG AGGACTTTGA GAAGTTAACT 1260
TTAAATAAAG GAGGAAACTG CCTTCTTAAT ATTCCAAAGC CTGATGAAGC CTATAGTGCT 1320
CCCTCCTGTG GTAATAAGTT GGTGGACGCT GGGGAAGAGT GTGACTGTGG TACTCCAAAG 1380
GAATGTGAAT TGGACCCTTG CTGCGAAGGA AGTACCTGTA AGCTTAAATC ATTTGCTGAG 1440
TGTGCATATG GTGACTGTTG TAAAGACTGT CGGTTCCTTC CAGGAGGTAC TTTATGCCGA 1500
GGAAAAACCA GTGAGTGTGA TGTTCCAGAG TACTGCAATG GTTCTTCTCA GTTCTGTCAG 1560
CCAGATGTTT TTATTCAGAA TGGATATCCT TGCCAGAATA ACAAAGCCTA TTGCTACAAC 1620
GGCATGTGCC AGTATTATGA TGCTCAATGT CAAGTCATCT TTGGCTCAAA AGCCAAGGCT 1680
GCCCCCAAAG ATTGTTTCAT TGAAGTGAAT TCTAAAGGTG ACAGATTTGG CAATTGTGGT 1740
TTCTCTGGCA ATGAATACAA GAAGTGTGCC ACTGGGAATG CTTTGTGTGG AAAGCTTCAG 1800
TGTGAGAATG TACAAGAGAT ACCTGTATTT GGAATTGTGC CTGCTATTAT TCAAACGCCT 1860
AGTCGAGGCA CCAAATGTTG GGGTGTGGAT TTCCAGCTAG GATCAGATGT TCCAGATCCT 1920
GGGATGGTTA ACGAAGGCAC AAAATGTGGT GCTGGAAAGA TCTGTAGAAA CTTCCAGTGT 1980
GTAGATGCTT CTGTTCTGAA TTATGACTGT GATGTTCAGA AAAAGTGTCA TGGACATGGG 2040
GTATGTAATA GCAATAAGAA TTGTCACTGT GAAAATGGCT GGGCTCCCCC AAATTGTGAG 2100
ACTAAAGGAT ACGGAGGAAG TGTGGACAGT GGACCTACAT ACAATGAAAT GAATACTGCA 2160
TTGAGGGACG GACTTCTGGT CTTCTTCTTC CTAATTGTTC CCCTTATTGT CTGTGCTATT 2220
TTTATCTTCA TCAAGAGGGA TCAACTGTGG AGAAGCTACT TCAGAAAGAA GAGATCACAA 2280
ACATATGAGT CAGATGGCAA AAATCAAGCA AAGCCTTCTA GACAGCCGGG GAGTGTTCCT 2340
CGACATGTTT CTCCAGTGAC ACCTGOCAGA GAAGTTCCTA TATATGCAAA CAGATTTGCA 2400
GTACCAACCT ATGCAGCCAA GCAACCTCAG CAGTTCCCAT CAAGGCCACC TCCACCACAA 2460
COGAAAGTAT CATCTCAGGG AAACTTAATT CCTGCCCGTC CTGCTCCTGC ACCTCCTTTA 2520
TATAGTTCCC TCACTTGATT TTTTTAACCT TCTTTTTGCA AATGTCTTCA GGGAACTGAG 2580
CTAATACTTT TTTTTTTTCT TGATGTTTTC TTGAAAAGCC TTTCTGTTGC AACTATGAAT 2640
GAAAACAAAA CACCACAAAA CAGACTTCAC TAACACAGAA AAACAGAAAC TGAGTGTGAG 2700
AGTTGTGAAA TACAAGGAAA TGCAGTAAAG CCAGGGAATT TACAATAACA TTTCCGTTTC 2760
CATCATTGAA TAAGTCTTAT TCAGTCATCG GTGAGGTTAA TGCACTAATC ATGGATTTTT 2820
TGAACATGTT ATTGCAGTGA TTCTCAAATT AACTGTATTG GTGTAAGATT TTTGTCATTA 2880
AGTGTTTAAG TGTTATTCTG AATTTTCTAC CTTAGTTATC ATTAATGTAG TTCCTCATTG 2940
AACATGTGAT AATCTAATAC CTGTGAAAAC TGACTAATCA GCTGCCAATA ATATCTAATA 3000
TTTTTCATCA TGCACX3AATT AATAATCATC ATACTCTAGA ATCTTGTCTG TCACTCACTA 3060
CATGAATAAG CAAATATTGT CTTCAAAAGA ATGCACAAGA ACCACAATTA AGATGTCATA 3120
TTATTTTGAA AGTACAAAAT ATACTAAAAG AGTGTGTGTG TATTCACGCA GTTACTCGCT 3180
TCCATTTTTA TGACCTTTCA ACTATAGGTA ATAACTCTTA GAGAAATTAA TTTAATATTA 3240
GAATTTCTAT TATGAATCAT GTGAAAGCAT GACATTCGTT CACAATAGCA CTATTTTAAA 3300
TAAATTATAA GCTTTAAGGT ACGAAGTATT TAATAGATCT AATCAAATAT GTTGATTCAT 3360
GGCTATAATA AAGCAGGAGC AATTATAAAA TCTTCAATCA ATTGAACTTT TACAAAACCA 3420
CTTGAGAATT TCATGAGCAC TTTAAAATCT GAACTTTCAA AGCTTGCTAT TAAATCATTT 3480
AGAATGTTTA CATTTACTAA GGTGTGCTGG GTCATGTAAA ATATTAGACA CTAATATTTT 3540
CATAGAAATT AGGCTOGAGA AAGAAGGAAG AAATGGTTTT CTTAAATACC TACAAAAAAG 3600
TTACTGTGGT ATCTATGAGT TATdTCTTA GCTGTGTTAA AAATGAATTT TTACTATGGC 3660

AGA3ATQGTA TCGATOSTAA AATTTTAAGC ACTAAAAATT TTTTCATAAC CTTTCATAAT 3720
AAAGTTTAAT AATAGGTTTA TTAACTGAAT TTCATTAGTr TTTTAAAACrr GTTTTTGGTT 3780 
TGTGTATATA TACATATACA AATACAACAT TTACaiATAAA TAAAATACTT GAAATTCTCA 3840 
AAAAAAAAAA AAAAAAAAAA AAAAA 



Seq ID NO: 633 Protein sequence 
Protein Accession #: NP_003807.1

1 11 21 31 41 51 

| | | | | |

MGSGARPPSG TLRVRWLLLL GLVGPVLGAA RPGFQQTSHL SSYEIITPHR LTREIREAPR 60

PYSKQVSYVI QAEGKEHIIH LERNKDLLPE DFVVYTYNKE GTLITDHPNI QNHCHYRGYV 120

EGVHSSSIAL SDCFGLRGLL HLEMASYGIE PLQNSSHFEH IIYRMDDVYK EPLKCGVSNK 180

DIEKETAKDE EESPPSMIQL LRSRRAVLPQ TRYVELPIVV DKERYDMMGR KQTAVREEMI 240

LLANYLDSMY IKLNIRIVLV GLEIWTNGNL INIVGGAGDV LGNFVQWREK FLITRRRHDS 300

ADLVLKKGFG GTAGMAFVGT VCSRSHAGGI KVFGQITVET FASIVAHELG HNLGMNHDDG 360

RDCSCGAKSC IMNSGASGSR NFSSCSAEDF EKLTLNKGGN CLLNIPKPDE AYSAPSCGNK 420

IVDAGEECDC GTPKECELDP CCEGSTCKLK SFAECAYGDC CKDCRFLPGG TLCRGKTSEC 480

DVPEYCNGSS QFCQPDVFIQ KGYPCQNNKA YCYNGMCQYY DAQCQVIPGS KAKAAPKDCP 540

IEVNSKGDRF GNCGPSGNEY KKCATGNALC GKLQCENVQE IPVFGIVPAI IQTPSRGTKC 600

WGVDFQLGSD VPDPGMVNEG TKCGAGKICR HPQCVDASVL NYDCDVQKKC HGHGVCSSNK 660

NCHCENGWAP PNCETKGYGG SVDSGPTYNE MNTALRDGLL VPFFLIVPLI VCAIFIPIKR 720

DQLWRSYFRK KRSQTYESDG KNQANPSRQP GSVPRHVSPV TPPREVPIYA MRFAVPTYAA 780
KQPQQPPSRP PPPQPKVSSQ QNLIPASPAF APPLYSSLT

Seq ID NO: 634 DNA sequence

Nucleic Acid Accession #: NM_002091.1

Coding sequence: 56..503

1 11 21 31 41 51 

iGTCrCTGCT CTTCCCAGCC TCTCOGGCGC GCTCCAAGGG CrrCCCX3TCG GGACCATGOG 60 

CGGCAGTGAG CTCCCXSCTGG TCCTGCTGGC GCTGGTCCTC TGCCTAGCGC CCCGGGGGCG 120

AGCGGTCCCG CTGCCTGCGG GCGGAGGGAC CGTGCTGACC AAGATGTACC GGCGCGGCAA 180

CCACTCGGCG GTGGGGCACT TAATGGGGAA AAAGAGCACA GGGGAGTCTT CTTCTGTTTC 240

TGAGAGAGGG AGCCTGAAGC AGCAGCTGAG AGAGTACATC AGGTGGGAAG AAGCTGCAAG 300 

GAATTTCCTG GGTCTCATAG AAGCAAAGGA GAACAGAAAC CACCAGCCAC CTCAACXXrAA 360 

GGCCTTCGGC AATCAGCAGC CTTCXSTGGGA TTCAGAGGAT AGCAGCAACT TCAAAGATGT 420 

AGGTTCAAAA GGCAAAGTTC GTAGACTCTC TGCTCCAGGT TCTCAACGTG AAGGAAGGAA 480

CCCCCAGCTG AACCAGCAAT GATAATGATG GCCTCTCTCA AAAGAGAAAA ACAAAACCCC 540 

TAAGAGACTG AGTTCTGCAA GCATCAGTTC TACXK3ATCAT CAACAAGATT TCCTTGTGCA 600

AAATATTTGA CTATTCTGTA TCTTTCATCC TTGACTAAAT TCGTGATTTT CAAfiCftOCAT 660 

CTTCTGGTTT AAACTTGm GCTGTGAACA ATTGTCGAAA AGAGTCTTCC AATTAATGCT 720 

TTTTTATATC TAGGCTACCT GTTGGTTAGA TTCAAGGGCC GGAGCTGTTA CCATTCACAA 780
TAAAAGCTTA AACACAT 

Seq ID NO: 635 Protein sequence 
Protein Accession #: NP_002082.1

1 11 21 31 41 51 

MRGSELPLVL LALVLCLAPR GRAVPLPAGG GTVLTKMYPR GNHWAVGHLH GKKSTGESSS 60

VSERGSLKQQ LREYIRWEEA ARNLLGLIEA KENRNHQPPQ PKALGMQQPS WDSEDSSNPK 120 
DVGSKGKVGR LSAPGSQREG RNPQLNQQ



Seq ID NO: 636 DNA sequence 

Nucleic Acid Accession #: NM_016522.1

Coding sequence: 265.. 1299 

1 11 21 31 41 51 

GCGGAAGCAG CGAGGAGGGA GCCCCCTTTG GCCX3TCCTCC GTGGAACCGG TTTTCOGAGG 60 

CTGGCAAAAG CCGAGGCTGG ATTTGGGGGA GGAATATTAG ACTCGGAGGA GTCTGCGCGC 120 

TTTTCTCCTC CCCGCGCCTC CCGGTCGCCG CGGGTTCACC GCTCAGTCCX: CGCGCTCXSCT 180 

CCGCACCCCA CCCACTTCCT GTGCTCGCCC GGGGGGCGTG TGCCGTGCGG CTGCCGGAGT 240 

TCGGGGAAGT TGTGGCTGTC GAGAATGGGG GTCTGTGGGT ACCTGTTCCT GCCCTGGAAG 300 

TGCCTCGTGG TCGTGTCTCT CAGGCTGCTG TTCCTTGTAC CCACAGGAGT GCCCGTGCGC 360 

AGCGGAGATG CCACCTTCCC CAAAGCTATG GACAACGTGA CGGTCCGGCA GGGGGAGAGC 420

GCCACCCTCA GGTGCACTAT TGACAACCGG GTCACCCGGG TGGCCTGGCT AAACCGCAGC 480

ACCATCCTCT ATGCTGGGAA TGACAAGTGG TGCCTGGATC CTCGCGTGGT CCTTCTGAGC 540

AACACCCAAA CGCAGTACAG CATCGAGATC CAGAACGTGG ATGTGTATGA CGAGGGCCCT 600

TACACCTGCT CGGTGCAGAC AGACAACCAC CCAAAGACCT CTAGGGTCCA CCTCATTGTG 660

CAAGTATCTC CCAAAATTGT ACAGATTTCT TCAGATATCT CCATTAATGA AGGGAACAAT 720 

ATTAGCCTCA CCTGCATAGC AACTGGTAGA CCAGAGCCTA CGGTTACTTG GAGACACATC 780 

TCTCCCAAAG CGGTTGGCTT TGTGAGTGAA GACGAATACT TGGAAATTCA GGGCATCACC 840

CGGGAACAGT CAGGGGACTA CGAGTGCAGT GCCTCCAATG ACGTGGCCGC GCCCGTGGTA 900

CGGAGAGTAA AGGTCACCGT GAACTATCCA CCATACATTT CAGAAGCX^A GGGTACAGGT 960

GTOCCaSTGG GACAAAAGGG GACACTGCAG TGTGAAGCCT CAGCAGTCCC CTCAGCAGAA 1020

TTCCAGTGGT ACAAGGATGA CAAAAGACTG ATTGAAGGAA AGAAAGGGGT GAAAGTGGAA 1080

AACAGACCTT TCCTCTCAAA ACTCATCTTC TTCAATGTCT CTGAACATGA CTATGGGAAC 1140 

TACACTTGCG TGGCCTCCAA CAAGCTGGGC CACACCAATG CCAGCATCAT GCTATTTGGT 1200 

CCAGGCGCCG TCAGCGAGGT GAGCAACGGC ACGTCGAGGA GGGCAGGCTG CGTCTGGCTG 1260

CTGCCTCTTC TGGTCTTGCA CCTGCTTCTC AAATTTTGAT GTGAGTGCCA CTTCCCCACC 1320 

CGGGAAAGGC TGCCGCCACC ACCACCACCA ACACAACAGC AATGGCAACA CCGACAGCAA 1380 

CCAATCAGAT ATATACAAAT GAAATTAGAA GAAACACAGC CTCATGGGAC AGAAATTTGA 1440 

GGGAGGGGAA CAAAGAATAC TTTGGGGGGA AAAGAGTTTT AAAAAAGAAA TTGAAAATTG 1500

CCTTGCAGAT ATTTAGGTAC AATGGAiGTTT TCTTTTCCCA MCGGGMGh ACACAGCACA 1560
ou a aa.Ti'O G TCCCACTGCA AGCTGCATCG TGCAACCTCT TTGGTGCCAG tgtggg»ag 1620
GGCTCAGCCT CTCTGCCCAC AGACTGCCOC CAOSIGCAAC ATTCTOGACC TGG<Jja'COC 1680
AAATTCAATC ACTOCanVOA GAOGAACAGA ATGAGAOCTT UCUUXCAAG ObTUUOW:!' ' 1740
CXWaXXAAG ajrOGCXSCTG OGGGCACTTT eGTAGACTGT GCCAOCAOGG OGTGIGtTGT 1800
GAAAOGTGAA ATAAAAAGAC CAAAAAAAAA AAAAAAAAA



Seq ID NO: 637 Protein sequence
Protein Accession #: NP_057606.1

1 11 21 31 41 51
| | | | | |
MGVQGYLFLP WKCLVVVSLR LLFLVPTGVP VRSGDATFPK AMDNVTVRQG ESATLRCTID 60
NRVTRVAWLN RSTILYAGND KWCLDPRVVL LSNTQTQVSI EIQNVDVYDE GPYTCSVQTD 120
KHPKTSRVHL IVQVSPKIVE ISSDISINEG NNISLTCIAT GRPEPTVTWH HISPKAVGPV 180
SSDEYLEIQG ITREQSGDYE CSASNDVAAP VVRRVKVTVN YPPYISEAKG TGVPVGQKGT 240
LQCEASAVPS AEPQWYKDDK RLIEGKKGVK VENRPFLSKL IFFNVSEHDY GNYTCVASNK 300
LGHTNASIML PGPGAVSEVS NGTSRRAGCV WLLPLLVLHL LLKF



Seq ID NO: 638 DNA sequence

Nucleic Acid Accession #: NM_012261.1

Coding sequence: 203.. 1045

1 11 21 31 41 51
| | | | | |
GATTTGCTCT GCCAGCAGCT GTOGGTGCOS CGCTOGACAC OGAGTCCTAG CTAGGCGCTC 60
ACAGAATAGG CGCTCCCTCC CTCCOCCTTC TCTGTCCCCC GCCTCTCGCT CACCCCGGCC 120
CACTGCAG0G GOGACTTTGA GGGATTCCCT CTCTGGCGGC CTCTGCAGCA GCACAGCCGG 180
CCTCATTCGG GGCACTGCGA GTATGGATCT CCAAGGAAGA GGGGTCCCCA GCATCGACAG 240
ACTTCGAGTT CTCCTGATGT TGTTCCATAC AATGGCTCAA ATCATGGCAG AACAAGAAGT 300
GGAAAATCTC TCAGGCCTTT CCACTAACCC TGAAAAAGAT ATATTTGTGG TGCGGGAAAA 360
TGGGACGAGG TGTCTCATGG CAGAGTTTGC AGCCAAATTT ATTGTACCTT ATGATGTGTG 420
GGCCAGCAAC TACGTAGATC TGATCACAGA ACAGGCOGAT ATCGCATTGA CCCGGGGAGC 480
TGAGGTGAAG GGCGGCTGTG GCCACAGCCA GTCGGAGCTG CAAGTGTTCT GGGTGGATCG 540
OGCATATGCft CTCAAAATGC TCTTTGTAAA GGAAAGCCAC AACATCTCCA AGGGACCTGA 600
GGCGACTTGG AGGCTGAGCA AAGTGCAGTT TGTCTACGAC TCCTOSGAGA AAACCCACTT 660
CAAAGACGCA GTCAGTGCTG GGAAGCACAC AGCCAACTCG CACCACCTCT CTGCCTTGGT 720
CACCCCCGCT GGGAAGTCCT ATGAGTGTCA AGCTCAACAA ACCATTTCAC TGGCCTCTAG 780
TGATCCGCAG AAGACGGTCA CCATGATCCT GTCTGCGGTC CACATCCAAC CTTTTGACAT 840
TATCTCAGAT TTTGTCTTCA GTGAAGAGCA TAAATGCCCA GTGGATGAGC GGGAGCAACT 900
GGAAGAAACC TTGCCCCTGA TTTTGGGGCT CATCTTGGGC CTCGTCATCA TGGTAACACT 960
CGCGATTTAC CACGTCCACC ACAAAATGAC TGCCAACCAG GTGCAGATCC CrcXSGGACAG 1020
ATCCCAGTAT AAGCACATGG GCTAGAGGCC GTTAGGCAGG CAGCGCCTAT TGCTGCTCCC 1080
CCAACTGGAT CAGGTAGAAC AACAAAAGCA CTTTTCCATC TTGTACACGA GATACACCAA 1140
CATAGCTACA ATCAAACAGG CCTGGGTATC TGACGCTTGC TTGGCTTGTG TCCATGCTTA 1200
AACCCACGGA AGGGGGAGAC TCTTTCGGAT TTGTAGGGTG AAATGGCAAT TATTCTCTCC 1260
ATGCTGGGGA GGAGGGGAGG AGGGTCTCAG ACAGCTTTCG TGCTCATGGT GGCTTGGCTT 1320
TGACTCTCCA AAGAGCAATA AATCCCACTT GGAGCTGTAT CTGGGOOCAA AGTTTAGGGA 1380
TTGAAAACAT GCTTCTTTGA GGAGGAAACC CCTTTAGGTT CAGAAGAATA TGGGGTGCTT 1440
TGCTCCCTTG GACACAGCTG GCTTATCCTA TACAGTTGTC AATGCACACA GAATACAACC 1500
TCATGCTCCC TGCAGCAAGA CCCCTGAAAG TGATTCATGC TTCTGGCTGG CATTCTGCAT 1560
GTTTAGTGAT TGTCTTGGGA ATGTTTCACT GCTACCCGCA TCCAGCGACT GCAGCACCAG 1620
AAAAOGACTA ATGTAACTAT GCAGAGTTGT TTGGACTTCT TCCTGTGCCA GGTCXIAAGTC 1680
GGGGGACCTG AAGAATCAAT CTGTGTGAGT CTGTTTTTCA AAATGAAATA AAACACACTA 1740
TTCTCTGGC

Seq ID NO: 639 Protein sequence
Protein Accession #: NP_036393.1

1 11 21 31 41 51
| | | | | |
MDLQGRGVPS IDRLRVLLML FHTMAQIMAE QEVENLSGLS TNPEKDIFVV RENGTTCIWA 60
EFAAKFIVPY DVWASNYVDL ITEQADIALT RGAEVKGRCG HSQSELQVFM VDRAYALKML 120
FVKESHNMSK GPEATWRLSK VQFVYDSSEK THPKDAVSAG KHTANSHHLS ALVTPAGKSY 180
ECQAQQTISL ASSDPQKTVT MILSAVHIQP FDIISDFVPS EEHKCPVDER EQLEETLPLI 240
LGLILGLVIM VTLAIYHVHH KMTAHQVQIP RDRSQYKHMG

Seq ID NO: 640 DNA sequence

Nucleic Acid Accession #: NM_002993.1

Coding sequence: 64.. 408

1 11 21 31 41 51
| | | | | |
GOCACGAGCC AGTCTCCGGG CCTCCACCCA GCTCAGGAAC CCGCGAACCC TCTCTTGACC 60
ACTATGAGCC TCCOGTCCAG CGGOGCGGCC CGTGTGCOGG GTCCTTCGGG CTCCTTGTGC 120
GCGCTGCTCG CGCTGCTGCT CCTGCTGACG CCGCCGGGGC CCCTCGCCAG CGCTGGTCCT 180
GTCTCTCCTG TGCTGACAGA GCTGCGTTGC ACTTGTTTAC GCGTTACGCT GAGAGTAAAC 240
CCCAAAACGA TTGGTAAACT GCAGGTGTTC CCXGCAGGCC CGCAGTGCTC CAAGGTGGAA 300
GTGGTAGCCT CCCTGAAGAA CGGGAAGCAA GTTTGTCTGG ACCCGGAAGC CCCTTTTCTA 360
AAGAAAGTCA TCCAGAAAAT TTTGGACAGT GGAAACAAGA AAAACTGAGT AACAAAAAAG 420
ACCATGCATC ATAAAATTGC CCAGTCTTCA GCGGAGCAGT TTTCTGGAGA TCCCTGGACC 480
CAGTAAGAAT AAGAAGGAAG GGTTGGTTTT TTTCCATTTT CTACATGGAT TCCCTACTTT 540
GAAGAGTGTG GGGGAAAGCC TACGCTTCTC CCTGAAGTTT ACAGCTCAGC TAATGAAGTA 600
CTAATATAGT ATTTCCACTA TTTACTGTTA TTTTACCTGA TAAGTTATTG AACCCTTTGG 660
CAATTGACCA TATTGTGAGC AAAGAATCAC TGGTTATTAG TCTTTCAATG AATATTGAAT 720
TGAAGATAAC TATTGTATTT CTATCATACA TTCCTTAAAG TCTTACCGAA AAGGCTGTGG 780
ATTTCXSTATG GAAATAATGT TTTATTAGTG TGCTGTTGAG GGAiOGTATOC TGTTGTTCTT 840
ACTCACTCTT CTCATAAAAT AGGAAATATT TTAGTTCTGT TTTCTTGGGG AATATCTTAC 900

TCTTTACCCT AGGkTGCiAT TTAAGTTGTA CTGTATTAGA ACACTGGCsTG TGTGATACos 960

TTATCTGTGC AGAATATATT TCCTTATTCA GAATTTCTAA AAATTTAAGT TCTGTAAXSGG 1020

CTAATATATT CTCTTCCTAT GGTTTTAGAT GTTTGATGTC TTCTTAGTAT GGC3VTAATCT 1080

CATGATTTAC TCATTAAACT TTGATTTTGT ATGCTATTTT TTCACTATAG GATGACTATA 1140

ATTCTGGTCA CTAAATATAC ACTTTAGATA GATGAAGAAG CCCAAAAACA GATAAATTCC 1200

TGATTGCTAA TTTACATAGA AATGTATTCT CTTGGTTTTT TAAATAAAAG CAAAATTAAC 1260 

AATGATCTGT GCTCTGCAAA GTTrRSAAAA TATATTTGAA CAATTTGAAT ATAAATTCAT 1320 

CATTTAGTCX TCAAAATATA TACAGCATTG CTAAGATTTT CAGATATCTA TTCTGGATCT 1380

TTTAAAGGTT TTGACXy^TrT TGTTATGAC3G AATTATACAT GTATCACATT CACTATATTA 1440

AAATTGCACT TTTAXTTTTT CClgTCTCTC ATGTTGGTTT TTGGTACTTG TATTGTCATT 1500
TCGAGAAACA ATAAAAGATT TCTAAAOCRA AAAAAAAAAA AAAAAAA



Seq ID NO: 641 Protein sequence
Protein Accession #: NP_002984.1

1 11 21 31 41 51
| | | | | |
MSLPSSRAAR VFGPSGSLCA LLALLLLLTP PGPLASAGPV SAVLTELRCT CLRVTLRVMP
KTIGKLQVFP AGPQCSKVEV VASLKNGKQV CLDPEAPFLK KVIQKILDSG NKKN



Seq ID NO: 642 DNA sequence

Nucleic Acid Accession #: NM_013271.1

Coding sequence: 27.. 809

1 11 21 31 41 51

| | | | | |

TCCGGAGCCA GGCTOGCTGG GGCAGCATGG OSGGGTCGCC GCTGCTCTGG GGGCCGCGGG 60 

COSGGGGCGT CGGCCTTTTG GTGCTGCTGC TGCTCGGCCT GTTTCGGCCG CCCCCOGCGC 120 

TCTGCGCGCG GCCGGTAAAG GAACCCGGOG GCCTAAGCGC AGCGTCTCOG CCCTTGGCTG 180

AGACTOGOSC TCCTCGCCGC TTCCX3GCGGT CAGTGCCCCG AGGTGAGGCG GCGGGGGCX3G 240 

TCCAGGAGCT GGCGCGGGCG CTGGCGCATC TGCTGGAGGC CGAACGTCAG GAGCGGGCGC 300 

GGGCCGAGGC GCAGGAGGCT GAGGATCAGC AGGOGOGOGT CCTGGCGCAG CTGCTGCGOG 360 

TCTCGGGOGC CCCCCGCRAC TCTGATCCGG CTCTGGGCCT GGACGACGAC CCCGACGOGC 420 

CroCAGCGCA GCTOGCTOGC GCTCTGCTCC GC3GCCCQCCT TGACCCTGCC GCXCTAGCAG 480 

CCCAGCTTGT CCCOGCGCCC GTCCCCGCOS CGGCGCTCCG ACC(XGGCCC CCGGTCTACG 540

ACGAOSGCCC CGCGGGCCCX3 GATGCTGAGG AGGCAGGCGA CGAGACACCC GACGTGGACC 600 

CCGAGCTGTT GAGGTACTTG CTGGQACGGA TTCTTGCGGG AAGCGCGGAC TCCGAGGGGG 660 

TGGCAGCCCC GCGCCGCCTC CGCCGTGCCG COGACCAOGA TGTGGGCTCT GAGCTGCCCC 720 

CTOAGGGCGT GCTGGGGGCG CTGCTGCGTG TGAAACGCCT AGAGACCCCG GQGCCCCAGG 780 

TCCCTGCACG COGCCTCTTG CCACCCTGAG CACTGCCCXX3 ATCCCGTGCA CCCTGGGACC 840 

CAGAA6TGCC CCCGCCATCC CX3CCACCAGG ACTTCTCCCC GCCAGCACGT CCAGAGCAAC 900 

TTACCCCGGC CA6CCAGCCC TCTCACCCGA GGATCCCTAC <XCCTOGCCC ACAATAACAT 960 
GATCTGAGC 



Seq ID NO: 643 Protein sequence 
Protein Accession #: NP_037403.1

1 11 21 31 41 51 

MAGSPLLWGP RAGGVGLLVL LLLGLFRPPP ALCARPVKEP RGLSAASPPL AETGAPRRPR 60

RSVPRGEAAG AVQELARALA HLLEAERQER AKAEAQEAED QQARVLAQLL RVWGAPRNSD 120 

PALGLDDDPD APAAQLAHAL LAARLDPAAL AAQLVPAPVP AAALRPHPPV YDDGPAGPDA 180

EEAGDETPDV DPELLRYLLG RILAGSADSE GVAAPRRLRR AADHDVGSEL PPEGVLGALL 240
RVKRLETPAP QVPARRLLPP 



Seq ID NO: 644 DNA sequence 
Nucleic Acid Accession #: NM_002214
Coding sequence: 681.. 2990 

1 11 21 31 41 51

CCCAGAGCCG CCTCCCCCTG TTGCTGGCAT CCCGAGCTTC CTCCCTTGCC AGCCAGGAOG 60 

CTGCCX3ACTT GTCTTTGCCC GCTGCTCCGC AGACGGGGCT GCAAAGCTGC AACTAATGGT 120

GTTGGCCTCC CTGCCCACCT GTGGAAGCAA CTGCGCTGAT TGATGCX3CCA CAGACTTTTT 180 

TCCCCTCGAC CTCGCOGGCG TACCCTCCCA CAGATCCAGC ATCACCCAGT GAATGTACAT 240 

TAGOGTXWTT TCCCCCCCAG CTTCGGGCTT TGTTTGGGTT TGATTGTGTT TGGCTCTTCG 300 

CTAAGCTGAT TTATGCAGCA GAAGCCCCAC CGGCTGGAGA GAAACAAAAG CTCTTTTCTT 360 

TGTCCCGGAG CAGGCTGCGG AGCCCTTGCA GAGCOCTCTC TCCAGTOGCC 6CCGGGCCCT 420 

TGGCCGTCGA AGGAGGTGCT TCTCGCGGAG ACCXWGGGAC CCGCCGTGCC GAGCC6GGAG 480 

GGCCXSTAGGG GCCCTGAGAT GCCGAGCGGT GCCCGGGCCC GCTTACCTGC ACOSCTTGCT 540 

CXX3AGCCGCG GGGTCCGCCT GCTAGGCCTG CGGAAAAOGT CCTAGCGACA CTCGCC0G03 600 

GGCCCCGAGG TCGCCCGGGA GGCCGAGCCC GCGTCOGGAA GGCAGCCAGG CGGCGGGCGC 660 

GGGGCGGGCT GTTTTGCATT ATGTGCGGCT CGGGCCTGGC TTTTTTTACC GCTGCATTTG 720

TCTGCCTGCA AAACGACCGG CGAG6TCCCG CCTCGTTCCT CTGGGCAGCC TGGGTGTTTT 780 

CACTTGTTCT TCGACTGGGC CAAGGTGAAG ACAATAGATG TGCATCTTCA AATGCAGCAT 840 

CCTGTGCCAG CTGCCTTGCG CTGGGTCCAG AATGTGGATG GTGTGTTCAA GAGGATTTCA 900

TTTCAGGTGG ATCAAGAAGT GAACGTTGTG ATATTGTTTC CAATTTAATA AGCAAAGGCT 960 

GCTCAGTTCA TTCAATAGAA TACCCATCTG TGCATGTTAT AATACCCACT GAAAATGAAA 1020 

TTAATACCCA GGTGACACCA GGAGAAGTGT CTATCCAGCT GCGTCCAGGA GCCGAAGCTA 1080 

ATTTTATGCT GAAAGTTCAT CCTCTGAAGA AATATCCTGT GGATCTTTAT TATCTTGTTG 1140 

ATGTCTCAGC ATCAATGCAC AATAATATAG AAAAATTAAA TTCOGTTGGA AACGATTTAT 1200

CTAGAAAAAT GGCATTTTTC TCCCGTGACT TTCGTCTTGG ATTTGGCTCA TACGTTGATA 1260 

AAACAGTTTC ACCATACATT AGCATCCACX: CCGAAAGGAT TCATAATCAA TGCAGTGACT 1320 

ACAATTTAGA CTGCATGCCT CCCCATGGAT ACATCCATGT GCTGTCTTTG ACAGAGAACA 1380 

TCACTGAGTT TGAGAAAGCA GTTCATAGAC AGAAGATCTC TGGAAACATA GATACACCAG 1440

AAGGAGGTTT TGACGCCATG CTTGAGGCAG CTGTCTGTCA AAGTCATATC GGATCGCGAA 1500

AAGAGGCTAA AAGATTGCTG CTGGTGATGA CACATCAGAC GTCTCATCTC GCTCTTGATA 1560
GCAAATTGGC AGGCATAGTG GTGCCCAATG ACGGAAACTG TCATCTGAAA AACAACGTCT 1620

AOGTCAAATC GACAACCATG GAACACOCCT CACTAGGCCA ACmCAGAC AAATTAATAG 1680 

ACAACAACAT TAAIGTCATC TTTGCAGTTC AAGGAAAACA ATTTCATTOG TATAAOSATC 1740 

TTCTACCCCT CTTGCCAGGC ACCATTGCTG GTGAAATAGA ATCAAAGCCT GCAAACCTCA 1800

ATAATTTGGT AGTCGAAGCC TATCAGAAGC TCATTTCAGA AGIGAAAGTT CAGGTGGAAA 1860 

ACCAGGTACA AGGCATCTAT TTTAACATTA CCGCCATCTG TCCAGATGGG TCCAGAAAGC 1920

CAQGCATGGA AGGATGCAGA AACGTGACGA GCAATGATGA AGTTCTTTTC AATGTAACAG 1980 

TTACAATCAA AAAATCTGAT GTCACAGGAG GAAAAAACTA TCCAATAATC AAACCTATTG 2040 

GTTTTAATCA AACCGCTAAA ATTCATATAC ACftGAAACTG CAGCTGTCAC TGTGAGGACA 2100 

ACAGAGGACC TAAAGGAAAG TGTGTAGATG AAACnTTCT AGATTCCAAG TGTTTCCAGT 2160 

GTGATGAGAA TAAATGTCAT TTTGATGAAG ATCAGTTTTC TTCTGAGAGT TGCAAGTCAC 2220 

ACAAGGATCA GCCTGTTTGC AGTOGTCGAG GAGTTTGTGT TTGTGGGAAA TGTTCATGTC 2280

ACAAAATTAA GCTTCGAAAA GTGTATGGAA AATACTGTGA AAAGGATGAC TTTTCTTGTC 2340 

CATATCACCA TGGAAATCTG TGTGCTGGGC ATGGAGAGTC TGAAGCAGGC AGATGCCAAT 2400 

GCTTCAGTGG CTGGGAAGGT GATCX3ATGCC AGTGCCCTTC AGCAGCAGCC CAGCACTGTG 2460 

TCAATTCAAA GGGCCAAGTG TCCAGTGGAA GAGGCaOCTG 'Ugltn CTOGA AGGTGTGAGT 2520 

GCACCGATCC CACGAGCATC GGCCGCTTCT GTGAACACTG CCCCACCTGT TATACAGCCT 2580 

GCAAGGAAAA CTCGAATTGT ATGCAATGCC TTCACCCTCA CAATTTCTCT CAGGCTATAC 2640 

TTGATCAGTC CAAAACCTCA TGTGCTCTCA TGGAACAACA GCATTATCTC GACCAAACTT 2700 

CAGAATCTTT CTCCAGCCCA AGCTACTTGA GAATATTTTT CATCATTTTC ATAGTTACAT 2760 

TCTTCATIGG GTOCTTAAA GTCCTGATCA TTAGACAGGT GATACTACAA TGGAATAGTA 2820 

ATAAAATTAA GTCCTCATCA GATTACAGAG TGTCAGCCTC AAAAAAGGAT AAGTTGATTC 2880 

TCCAAAGTCT TTGCACAAGA GCAGTCACCT ACCGACGTGA GAAGOCTGAA GAAATAAAAA 2940 

TGGATATCAG CAAATTAAAT GCTCATGAAA CTTTCAGGTG CAACTTCTAA AAAAAGATTT 3000

TTAAACACTT AATGGGAAAC TGGAATTGTT AATAATTGCT CCTAAAGATT ATAATTTTAA 3060

AAGTGACAGG AGGAGACAAA TTGCTCACGG TCATGCCAGT TGCTGGTTGT ACACTCGAAC 3120 

GAAGACTGAC AAGTATCCTC ATCATGATGT >SACTCACATA GCTGCTGACT TTTTCAGAGA 3180 

AAAATCTGTC TTACTACTGT TTGAGACTAG TGTCGTTGTA GCACTTTACT GTAATATATA 3240

ACTTATTTAG ATCAGCATAG AATGTA(SATC CTCTGAAGAG CACTGATTAC ACTTTA^G 3300 

TACCTGTTAT CCCTACGCTT CCCAGAGAGA ACAATGCTGT GAGAGAGTTT AGCATTOTGT 3360 

CACTACAAGG GTACAGTAAT CCCTGCACTG GACATGTGAG GAAAAAAATA ATCTGGCAAG 3420 

TATATTCTAA GGTTGCCAAA CACTTCAACA GTTGGTGGTT GAATAGACAA GAACAGCTAG 3480 

ATGAATAAAT GATTOGTGTT TCACTCTTTC AAGAGGTGAA CAGATACAAC CTTAATCTTA 3540 

AAAGATTATT GCTTTTTAAA GTCTGTAGTT TTATGCATGT GTGTTTATGG TTTGCTTATT 3600 

TTTCCAAGAT GGATACTAAT TCCAGCATTC TCTCCTCTTT GCCTTTATGT TTTGTTTTCT 3660 

TTTTTACAGG ATAAGTTTAT GTATGTCACA GATGACTOGA TTAATTAAGT GCTAAGTTAC 3720 

TACTGCCATA AAAAACTAAT AATACAATGT CACTTTATCA GAATACTAGT TmAAAGCT 3780 
GAATGTTAA 



Seq ID NO: 645 Protein sequence 
Protein Accession #: NP_002205

1 11 21 31 41 51 

| | | | | |

MCGSALAFFT AAPVCLQNDR RGPASPLWAA WVPSLVLGLG QGEDNRCASS NAASCARCLA 60

LGPECGWCVQ EDFISGGSRS ERCDIVSNLI SKGCSVDSIE YPSVHVIIPT EMEINTQVTP 120

GEVSIQLRPG AEANFMLKVH PLKKYPVDLY VLVDVSASMH NNIEKLNSVG NDLSRKMAPF 180

SRDFRLGFGS YVDKTVSPYI SIHPERIHNQ CSDYNLDCMP PHGYIHVLSL TENITEPEKA 240

VHRQKISGMI DTPEGGFDAM LQAAVCESHI GWRKEAKRLL LVMTDQTSHL ALDSKLAGIV 300

VPNDGNCHLK NNVYVKSTTM EHPSLGQLSE KLIDNNINVI FAVQGKQFHW YiOSLLPLLPG 360

TIAGEIESKA ANLNNLVVEA YQKLISEVKV QVEKQVQGIY FNITAICPDG SRKPGMEGCR 420

NVTSNDEVLF NVTVTMKKCD VTGGKNYAII KPIGFNETAK IHIHRNCSCQ CEDNRGPKGK 480

CVDETFLDSK CFQCDENKCH FDSDQFSSES CKSHKDQPVC SGRGVCVCGK CSCHKIKLRK 540

VYGKYCEKDD FSCPYHHGNL CAGHGECEAG RCQCFSGWEG DRCQCPSAAA QHCVNSKGQV 600

CSGRGTCVCG RCECTDPRSI GRPCEHCPTC YTACKENWMC MQCLHPHNLS QAILDQCKTS 660

CALMEQQHYV DQTSECPSSP SYLRIPFIIP IVTFLIGLLK VLIIRQVILQ WMSNKIKSSS 720
DYRVSASKKD KLILQSVCTR AVTYRREKPE EIKKDISKLN AHETFRCNP



Seq ID NO: 646 DNA sequence 

Nucleic Acid Accession #: NM_003318.1 

Coding sequence: 1..2574 

1 11 21 31 41 51 

ATGGAATCCG AGGATTTAAG TGGCAGAGAA TTGACAATTG ATTCCATAAT GAACAAAGTG 60 

AGAGACATTA AAAATAAGTT TAAAAATGAA GAOCTTACTG ATGAACTAAG CTTGAATAAA 120 

ATTTCTGCTG ATACTACAGA TAACTCGGGA ACTGTTAACC AAATTA TGAT GATGGCAAAC 180 

AACCCAGAGG ACTGGtTGAG T'rrG T TGC TC AAACTAGAGA AAAACAGTGT TCOGCTARgT 240 

GATGCTCTTT TAAATAAATT GATTGGTCGT TACAGTCAAG CAATTGAAGC 6CTTCCOCCA 300 

GATAAATATG GCCAAAATGA GAGTTTTGCT AGAATTCAAG TGAGATTTGC TGAATTAAAA 360 

GCTATTCAAG AGCCAGATGA TGCACGTGAC TACTTTCAAA TGGCCAGAGC AAACTGCAAG 420 

AAATTTGCTT TTGTTCATAT ATCTTTTGCA CAATTTGAAC TGTCACAAGG TAATGTCAAA 480 

AAAAGTAAAC AACTTCTTCA AAAAGCIGTA QAAOJIGGAG CAGTACCACT AGAAATGCTG 540 

OAAATTGCCC TGCGGAATTT AAACCTCCAA AAAAAGCAGC TGCTTTCAGA GGAGGAAAAG 600 

AAGAATTTAT CA6CATCTAC GGTATTAACT GCCCAAGAAT CATTTTCCGG TTCA CTTGGG 660 

CATTTACAGA ATAGGAACAA CAGTTGTGAT TCCAGAGGAC AGACTACTAA AGOCAGGTTT 720 

TTATATGGAG AGAACATGCC ACCACAAGAT GCAGAAATAG GTTACCGGAA TTCATT GAGA 780 

CAAACTAACA AAACTAAACA GTCATGCCCA TTTGGAAGAG TCCXAGTTAA CCTTCTAAAT 840 

AGCCCAGATT GTGATGTGAA GACAGATGAT TCAGTTGTAC CTTGTTTTAT 6AAAAGACAA 900 

ACCTCTAGAT CAGAATGCCG AGATTTQGTT GTGCCTG6AT CTAAACCAAfi TGGAAAT6AT 960 

TCCTGTGAAT TAAGAAATTT AAAGTCTGTT CAAAATA6TC ATTTCAAGGA ACCTCTGGTG 1020 

TCAGATGAAA AGAGTTCTGA ACTTATTATT ACTGATTCAA TAACCCTGAA GAATAAAAOG 1080 

GAATCAAGTC TTCTAGCTAA ATTAGAAGAA ACTAAAGAGT ATCAAGAACC AGAGGTTCCA 1140 

GAGAGTAACC AGAAACAGTG GCAATCTAAG AGAAAGTCAG AGTGTATTAA CCAGAATCCT 1200 

GCTGCATCTT CAAATCACTG GCAGATTCCG GAGTTAGCCC GAAAAGTTAA TACAGAGCAG 1260 

AAACATACCA CTTTTQAGCA ACCTGTCTTT TCAGTTTCAA AACAGTCACC AC CAATA TCA 1320 

ACAICTAAAT GGTTTGACCC AAAATCTATT TGTAAGACAC CftAGCftGCaA TftCCTTGGAT 1380 

GATTJVCATGA GCTCTTTTAG AACTCCAGTr GTAAflGAATG ACTTTCCACC TGCTTGTCAC 1440 

TTCTCAACAC CTTATGGCCA ACCTGCCTGT TTCCAGCAGC AACAJGCATCA AATACTTGCC 1500 

ACroCaCTTC AAAATTTACA GGTTTTAGCA TCTTCTTCAG CAAATCAATG CATTTaSGTT 1560 

AAAGGAACAA TTTATTCCAT TTTAAAGCAG ATAGGAAC3TG GAGGTTCAAG C3KAGGTATTT 1620 

CaGGTGTTAA ATGAAAAfiAA ACAGATATAT GCTATAAAAT ATGTGAACTT AGAAGAA6CA 1680 

GATAACCAAA CTCTTCATAG TTACOGGAAC GAAATAGCTT ATTTGAATAA ACTACAACAA 1740 

CACAGTGATA AGATCATCCG ACTTTATGAT TATGAAATCA CGGACCAGTA CATCTACATG 1800 

GTAATCGACT GTGGAAATAT TGATCTTAAT AGTrGfeCTTA AAAAGAAAAA ATOCATTCAT 1860 

CCATC3GGAAC GCAAGAGTTA CTGGAAAAAT ATCTTAGAGG CAGTTCACAC AATCCATCAA 1920 

CATOGCATTG TTCACAGTGA TCTTAAWXA GCTAACTTTC TGATAGTTGA TGC5AATGCTA 1980 

AAGCTAATTC ArmXSGGAT TX3CAAACCAA ATX3CAACCAC ATACAACAAG TGTTGTTAAA 2040 

GATTCTCAGG TTGGCACAGT TAATTATATG CCACCAGAAG CAATCAAAGA TATGTCTTCC 2100 

TCCAGAGAGA ATCGGAAATC TAAGTCAAAG ATAAGCCCCA AAAGTGATGT TTGCTCCPTA 2160 

GGATCTATTT TCTACTATAT GACTTACGGG AAAACACCAT TTCAGCAGAT AATTAATCAG 2220 

ATTTCTAAAT TACATGCCAT AATTGATCCT AATCATGAAA TTGAATTTCC CGATATTCCA 2280 

GAGAAAGATC TTCAAGATGT GTTAAAGTGT TGTTTAAAAA GGGACCCAAA ACAGAGGATA 2340 

TCCATTCCTC AGCTCCTGGC TCATCCCTAT CTTCAAATTC AAACTCATCC AGTTAACCAA 2400 

ATGGCCAAGG GAACCACTGA AGAAATGAAA TATGTTCTGG GCCAACTTGT TGGTCTGAAT 2460 

TCTCCTAACT CCATTTTGAA AGCTGCTAAA ACTTTATATG AACACTATAG TGGTGGTGAA 2520 
AGTCATAATT CTTCATCCTC CAAGACTTTT GAAAAAAAAA GGGGAAAAAA ATGA 



Seq ID NO: 647 Protein sequence 
Protein Accession #: NP_003309.1 

1 11 21 31 41 51 

| | | | | |

MESEDLSGRE LTIDSIMNKV RDIKNKPKNB DLTDELSLNK ISADTTDNSG TVKQIMMMAN 60 

NPEDMLSU.L KLEKNSVPLS DALLNKLIGR YSQAIEALPP DKYGCHESPA RIQVRPAELK 120 

AIQEPDDARD YFQMARANCK KPAFVHISPA QFELSQGNVK KSKQLLQKAV ERGAVPI^aOi 180 

EIALRNLNLQ iOCQLLSEBEK KNLSASTVLT AQESFSGSLG HLQ2fl2NNSCD SRGOTTKARF 240 

LYGBtlNPPQD ABIGYRNSLR QTNRTKQSCP PGRVPVSliLN SPDCDVKTDD SWPCFMKRQ 300 

TSRSECRDLV VPGSKPSGND SCELiaiLKSV QNSHFKEPLV SOEKSSELII TDSITUCNKT 360 

ESSLLAKLEE TKBYQEPEVP ESNQKQWQSK RKSECINQNP AASSNHWQIP ELARKVNTEQ 420 

KHTTFEQPVF SVSXQSPPIS TSKWFDPKSI CKTPSSNTLD DYMSCFRTPV VKNOFPPAOQ 480 

LSTPYGQPAC FQQQQHQILA TPLC3NLQVLA SSSANECISV KGRIYSILKQ IGS6GSSKVF 540 

QVUIBKKQIY AIKYVNLEEA DNQTLDSYRK EIAYUIKU5Q HSDKIIRLYD YEITDQYIYM 600 

VMECGBIIOLK SWLKKKKSID PWSRKSYWKN MLEAVHTIHQ HGIVHSDLKP ANPLIVDGML 660 

KLIDFGIANQ MQPDTTSWK DSQVGTVNYM PPEAIKDMSS SRENGKSKSK ISPKSDVWSl* 720 

GCILYYMTYG KTPF«)IINQ ISKLHAIIDP NHEIEPPDIP EKDLQDVLKC CLKROPKQRI 780 

SIPELLAHPY VQIQTHPVNQ MAKGTTEEMK YVLGQLVGLM SPNSILKAAK TLYEHYSGGB 840 
SHNSSSSKTF EKKRGKK 



Seq ID NO: 648 DNA sequence 
Nucleic Acid Accession #: NM_015507 
Coding sequence: 241..1902 

1 11 21 31 41 51 

CCGCAGAGGA GCCTCGGCCA GGCTAGCCAG GGCGCCCCCA GCCCCTCCCC AGGCCGCGAG 60 

CGCCCCTGCC GCGGTGCCTG GCCTCCCCTC CCAGACTGCA GGGACAGCAC CCGGTAACTG 120 

OSAGTGGAGC GGAGGACCOG AGCGGCT6AG GAGAGAGGAG GCGGCGGCTT AGCTGCTACG 180 

GGGTCCGGCX: GGCGCOCTCC OGAGGGGGGC TCAQGAGGAG GAAGGAGGAC CQGTGOCT^ 240 

ATGCCTCTGC CCTG6AG0CT TGOGCTCCOS CTGCTGCTCT CCTGGGTGGC AGGTGGTTTC 300 

GGGAACGCGG CCAGTGCAAG GCATCACGGG TTGTTAGCAT CGGCACGTCA GCCTGGGGTC 360 

TGTCACTATG GAACTAAACT GGCCTGCTGC TAOGGCTGGA GAAGAAACAG CAAGGGAGTC 420 

TGTGAAGCTA CATGCGAACC TGGATGTAAG TTTGGTGAGT GCGTGGGACC AAACAAATGC 480 

AGATCCTTTC CflGGATACAC OGGGAAAACC TGCAGTCAAG ATGTGAATGA GTGTGGAATG 540 

AAACCCCGGC CATGCCAACA CAGATGTGTG AATACACACG GAAGCTACAA GTGCTTTTGC 600 

CTCAGTGGCC ACATGCTCAT GCCAGAT6CT AOGTGTGTGA ACTCTAGGAC ATGTGCCATG 660 

ATAAACTGTC AGTACAGCTG TGAAGACACA GAA6AAGGGC CACAGTGCCT GtGTCCATCC 720 

TCAGGACTCC GCCTGGCCCC AAATGGAAGA GACTGTCTAG ATATTGATGA ATGTGCCTCT 780 

GGTAAAGTCA TCTGTCCCTA CAATCGAAGA TGTGTGAACA CATTTGGAAG CTACTACTGC 840 

AAATGTCACA TTGGTTTCGA ACTGCAATAT ATCAGTGGAC GATATGACTG TATAGATATA 900 

AATGAATGTA CTATGGATAG CCATACGTGC AGCCACCATG CCAATTGCTT CAATACCCAA 960 

GGGTCCTTCA AGTGTAAATQ CAAGCAQGOA TATAAAGGCA ATGGACTTOG GTGTTCTGCT 1020 

ATCCCTGAAA ATTCTGTGAA GGAAGTCCTC AGAGCACCPG GTACCATCAA AGACAGAATC 1080 

AAGAAGTTGC TTGCTCACAA AAACAGCATG AAAAAGAAGG CAAAAATTAA AAATGTTACC 1140 

CCAGAACCCA CCAGGACTCC TACCCCTAAG GTGAACTTGC AGCCCTTCAA CTATGAAGAG 1200 

ATAGTTTCCA GAGGCGGGAA CTCTCATGGA GGTAAAAAAG GGAATGAAGA GAAAATGAAA 1260 

GAGGGGCTTG AG6ATGAGAA AAGA6AAGAG AAAGCCCTGA AGAATGACAT AGAGGAGCGA 1320 

AGCCTGOGAG GAGATGTGTT TTTCCCTAAG GTGAATGAAG CAGGTGAATT CX3GCCTGATT 1380 

CTGGTCCAAA GGAAAGOSCT AACTTCCAAA CTGGAACATA AAGATTTAAA TATCTCGGTT 1440 

GACTGCAGCr TCAATCATGG GATCTGTGAC TGGAAACAGG ATAGAGAAGA TOATTTTGAC 1500 

TGGAATCCTG CTGATCX3AGA TAATGCTATT GGCTTCTATA TGGCAGTTCC GGCCTTGGCA 1560 

GGTCACAAGA AAGACATTGG CCGATTGAAA CTTCTCCTAC CTGACCTGCA ACCCXAAAGC 1620 

AACTTCTGTT TGCTCTTTGA TTACCGGCTC GCCGGAGACA AAGTCGGGAA ACTTCGAGTG 1680 

TTTGTGAAAA ACAGTAACAA TGCCCTGGCA TGGGAGAAGA CCACGAGTGA GGATGAAAAG 1740 

TG6AAGACA6 GGAAAATTCA GTTGTATCAA GGAACTGATG CTACCAAAAG CATCATTTTT 1800 

GAAGCAGAAC GTGGCAAGGG CAAAACCGGC GAAATOGCAG TGGATGGCGT CTTGCTTGTT 1860 

TCAGGCTTAT GTCCAGATAG CCTTTTATCT GTGGATGACT GAATGTTACT ATCTTTATAT 1920 

TTCACTTTGT ATGTCAGTTC CCTGGTTTTT TTGATATTGC ATCATAGGAC CTCTGGCATT 1980 

TTAGAATTAC TAGCTGAAAA ATTGTAATGT ACCAACAGAA ATATTATTGT AAGATGCCTT 2040 

TCTTQTATAA GATATGCCAA TATTTGCTTT AAATATCATA TCACTGTATC TTCTCAGTCA 2100 

TTTCIGAATC TTTCCACATT ATATTATAAA ATATGGAAAT GTCAGTTTAT CTCCCCTCCT 2160 

CAGTATATCT GATTTGTATA AGTAAGTTGA TGAGCTTCTC TCTACAACAT TTCTAGAAAA 2220 

TAGAAAAAAA AGCACACAGA AATGTTTAAC TGTTTGACTC TTATCATACT TCTTGGAAAC 2280 

TATGACATCA AAGATAGACT TTTGCCTAAG TGGCTTAGCT GGGTCTTTCA TAGCCAAACT 2340 
TGTATATTTA AATTCTTTGT AATAATAATA TCCAAATCAT CAAAAAAAAA AAAAAAAA 



Seq ID NO: 649 Protein sequence 
Protein Accession #: NP_056322 

1 11 21 31 41 51 

MPLPWSLALP LLLSMVACGF GNAASARHKG LLASARQICV CHYGTKLACC YGWRRKSKGV 60 

CEATCEPGOC FGECVGPUKC RCFPGYTGKT CSQDVNECGM KPRPOQHRCV UTHGSYKCFC 120 

LSGHMU4PDA TCVNSRTCAM IKCQYSCEDT BEGPQCLCPS SGLHIiAPNGR OCLDIDBCAS 180 

GKVICPVNRR CVNTFGSYYC KCHIGFBLQY ISGRYDCIDI NECTMDSHTC SHHAKCPNTQ 240 

GSFKCKCKQG YKSStGVRCSA IPENSVKEVIi RAPGTIKDRI KKLLAHKNSM KKKAKIKNVT 300 

PEPTRTPTPK VMLQPFNYEB IVSKGGNSHG GKKGNSEKMK EGLEDEKREB KALKNDIEER 360 

S1JIG0VFFPK VNEAGEFGLI LVQRKALTSK LBHKDLWISV DCSFNHGICD WKQDREDDPD 420 

WNPADRDNAI GPYMAVPALA GHKKDIGRUC LLLPDLQPQS HFCIiLPBYRL AODiCVGKLRV 480 

FVKNSNNALA MEKTTSEDEK MKTGKIQLYQ GTOATKSIIF EAERGKGKIG BIAVDGVLI.V 540 
SGLCPDSLLS VDO 

Seq ID NO: 650 DNA sequence 

Nucleic Acid Accession #: NM_003506.1 

Coding sequence: 259..2379 

1 11 21 31 41 51 

GCAGCTCCAG TCCCGGACGC AACCCCGGAG COGTCTCAGG TCCCTGGGGG GAACGGTGGG 60 

TTAGAOGGGG ACGGGAAGGG ACAGCGGCCT TOGACGGCXX: CCOGAGTAAT TCACCCAGGA 120 

CTGATTTTCA GGAAAGCCTC AAAATGAGTA AAATAGTGAA ATGAGGAATT TGAACATTTT 180 

ATCTTTGGAT GGGGATCTTC TGAGGATGCA AAGAtHGATT CATCCAAGCC ATGTGGTAAA 240 

ATGAGGAATT TGAAGAAAAT GGAGATGTTT ACATTTTTGT TGACGTGTAT TTTTCTACOC 300 

CTCCTAAGAG GGCACAGTCT CTTCACCTGT GAACCAATTA CTGTTCCCAG ATGTATGAAA 360 

ATGGCCTACA ACATGACGTT TTTCCCTAAT CTGATGGGTC ATTATGACCA GAGTATTGCC 420 

GC3GGTGGAAA TGGAGCATTT TCTTCCTCTC GCAAATCTGG AATGTTCACC AAACATTGAA 480 

ACTTTCCTCT GCAAAGCATT TOTACCAACC TGCATAGAAC AAATTCATGT GGTTCCACCT 540 

TtSTCGTAAAC TTTGTGAGAA AGTATATTCT GATTGCAAAA AATTAATTGA CACTTTrGGG 600 

ATCCGATGGC CTGAGGAGCT TGAATGTGAC AGATTACAAT ACTGTGATGA GACTGTTCCT 660 

GTAACTTTTG ATCCACACAC AGAATTTCTT GGTCCTCAGA AGAAAACAGA ACAAGTCCAA 720 

AGAGACATTG GATTTTGGTG TCCAAGGCAT CTTAAGACTT CTGGGGGACA AG6ATATAAG 780 

TTTCTGGGAA TTGACCAGTG TGCGCCTCCA TGCCCCAACA TGTATTTTAA AACTGATGAG 840 

CTAGAGTTTG CAAAAAGTTT TATTGGAACA GTTTCAATAT TTTGTCTTTG TGCAACTCTG 900 

TTCACATTCC TTACTTTTTT AATTGATGTT ACAAGATTCA GATACCCAGA GAGACCAATT 960 

ATATATTACT CTGTCTGTTA CAGCATTGTA TCTCTTATGT ACTTCATTGG ATTTTTGCTG 1020 

GGCGATAGCA CAGCCTGCAA TAAGGCAGAT GAGAAGCTAG AACTTGGTGA CACTGTTGTC 1080 

CTAGGCTCTC AAAATAAGGC TTGCACCGTT TTGTTCATGC TTTTGTATTT TTTCACAATG 1140 

GCTGGCACTG TGTGGTGGGT GATTCTTACC ATTACTTGGT TCTTAGCTGC AGGAAGAAAA 1200 

TGGAGTTGTG AAGCCATCGA GCAAAAAGCA GTGTGGTTTC ATGCTGTTGC ATGGGGAACA 1260 

CCAGGTTTCC TGACTGTTAT GCTTCTTGCT CTGAACAAAG TTGAAGGAGA CAACATTAGT 1320 

GGAGTTTGCT TTGTTGGCCT TTATGACCTG GATCCTTCTC GCTACTTTGT ACTCTTGCCA 1380 

CTGTGCCTTT GTGTCTTTGT TGGGCTCTCT CTTCTTTTAG CTGGCATTAT TTCCTTAAAT 1440 

CATGTTCGAC AAGTCATACA ACATGATGGC OGGAACX»AG AAAAACTAAA 0AAATTTAT6 1500 

ATTCGAATTG GAGTCTTCAG CGGCTTGTAT CTTGTGCCAT TAGTGACACT TCTOGGATGT 1560 

TACGTCTATG AGCAAGTGAA CAGGATTACC TGGGAGATAA CTTGGGTCTC TGATCATTGT 1620 

CGTCAGTACC ATATCCCATG TCCTTATCAG GCAAAAGCAA AAGCTCGACC AGAATTGGCT 1680 

TTATTTATGA TAAAATACCT GATGACATTA ATTGTTGGCA TCTCTGCTGT CTTCTGGGTT 1740 

GGAAGCAAAA AGACATGCAC AGAATGGGCP GGGTTTTTTA AACGAAATCO CAAGAGAGAT 1800 

CCAATCAGTG AAAGTCGAAG AGTACTACAG GAATCATGTG AGTTTTTCTT AAAGCACAAT 1860 

TCTAAAGTTA AACACAAAAA GAAGCACTAT AAACCAAGTT CACACAAGCT GAAGGTCATT 1920 

TCCAAATCCA TGGGAACCAG CACAGGAGCT ACAGCAAATC ATGGCACTTC TGCAGTAGCA 1980 

ATTACTAGCC ATGATTACCT AGGACAAGAA ACTTTGACAG AAATCCAAAC CTCACCAGAA 2040 

ACATCAATGA GAGAGGTGAA AGCGGACGGA GCTAGCACCC CCAGGTTAAG AGAACAGGAC 2100 

TGTGGTGAAC CTGCCTOGCC AGCAGCATCC ATCTCCAGAC TCTCTGGGGA ACAGGTrCGAC 2160 

GGGAAGGGCC AGGCAGGCAG TGTATCTGAA AGT6CG0GGA GTGAAGGAAG GATTAGTCCA 2220 

AAGAGTGATA TTACTGACAC TGGCCTGGCA CAGAGCAACA ATTTGCAGGT CCCCAGTTCT 2280 

TCAGAACCAA GCAGCCTCAA AGGTTCCACA TCTCTGCTTO TTCACCCAGT TTCAGGAGTG 2340 

AGAAAAGAGC AGGGAGCTGG TTGTCATTCA GATACTTGAA GAACATTTTC TCTCGTTACT 2400 

CAGAAGCAAA TTTGTGTTAC ACTGGAAGTG ACCTATGCAC TGTTTTGTAA GAATCACTGT 2460 

TACGTTCTTC TTTTGCACTT AAAGTTGCAT TGCCTACTGT TATACTGGAA AAAATAGAGT 2520 

TCAAGAATAA TATCACTCAT TTCACACAAA GGTTAATGAC AACftATATAC CTORAAACAG 2580 

AAATGTGCAG GTTAATAATA TTTTTTTAAT AGTGTGGGAG GACAGAOTTA GAGGAATCTT 2640 

CCTTTTCTAT TTATGAAGAT TCTACTCTTG GTAAGAGTAT TTTAAGATGT ACPATGCTAT 2700 

TTTACCTTTT TGATATAAAA TCAAGATATT TCTTTGCTGA AGTATTTAAA TCTTATCCTT 2760 

GTATCTTTTT ATACATATTT GAAAATAAGC TTATATGTAT TTGAACTTTT TTGAAATCCT 2820 

ATTCAAGTAT TTTTATCATG CTATTGTGAT ATTTTAGCAC TTTGGTAGCT TTTACACTGA 2880 

ATTTCTAAGA AAATTGTAAA ATAGTCTTCT TTTATACTGT AAAAAAAGAT ATACCAAAAA 2940 

GTCTTATAAT AGGAATTTAA CTTTAAAAAC CCACTTATTG ATACCTTACC ATCTAAAATG 3000 

TGTGATTTTT ATAGTCTOGT TTTAGGAATT TCACAGATCT AAATTATGTA ACTGAAATAA 3060 

GGTGCTTACT CAAAGAGTGT CCACTATTGA TTGTATTATG CTGCTCSUrTG ATOCTTCTGC 3120 

ATATTTAAAA TAAAATGTCC TAAAGGGTTA GTAGACAAAA TGTTAGTCTT TTGTATATTA 3180 

GGCCAAGTGC AATTGACTTC CCTTTTTTAA TGTTTCATGA CCACCCATTG ATTGTATTAT 3240 

AACCACTTAC AGTTGCTTAT ATTTTTTGTT TTAACTTTTG TTTCTTAACA TTTAGAATAT 3300 
TACATTTTGT ATTATACAGT ACCTTTCTCA GACATTTTGT AG 

Seq ID NO: 651 Protein sequence 
Protein Accession #: NP_003497.1 

1 11 21 31 41 51 

| | | | | |

MEMFTPLLTC IFLPtLRCaS LFTCEPITVP RCKKMAYNMT FPPNUffiHYD QSIAAVEMEH 60 

PLPLAIJLECS PNIETFLCKA FVPTCIEQIH WPPOHOiCB KVYSUaaOiI DTPGIRSfPEE 120 

LSCDRLQYCD ETVPVTFDPH TSFLGPQKKT GQVQBDIGFW CPaHXiKTSGG QGYKFUSIOQ 180 

CAPPCPNMYF KSDELEFAKS FIGTVSIFCL CATLFTFLT? LIOVBRFRYP ERPIIYYSVC 240 

YSIVSUWPI GFLLGDSTAC NKADEKLBLG DTWIiGSQNK ACTVLFMI.LY FFTMAGTWW 300 

VILTITWFIA AGRKWSCEAI EQKAVWFKAV AWGTPGFLTV MLLALSKVEG DSISGVCPVG 360 

LYDU3ASRYP VLLPLCLCV? VGLSIiLAGI ISLIJHVRQVI QHDGKKQEICL KKFMIRIC5VP 420 

SGLYLVPbVT LLGCYVYEQV HRITWBITOV SDHCSIQYHIP CPYQAKAKAR PBIALF MIKY 480 

LKTLIVGISA VFWVGSKKTC TBWAGPFKRN HXRDPISSSR RV«JESCBPP LKHSSKVKHK 540 

KKHYKPSSHK UCVISKSMGT STGATANHGT SAVAITSHDY LGQSTLTEIQ TSPETSMRBV 600 

KADGASTPRL REQDO^AS PAASISRLSG EQVDGKGQAG SVSESARSEG RISPXSDITD 660 
tGLAQSKNLQ VPSSS^SSL KGSTSLLVHP VSGVRKSQGG GtBSDT 

Seq ID NO: 652 DNA sequence 

Nucleic Acid Accession #: NM_014791.1 

Coding sequence: 171..2126 

1 11 21 31 41 51 

| | | | | |

TTGGCGGGCG GAAGCGGCCA CAACCCGGCG ATOGAAAAGA TTCTTAGGAA CGCCGTACCA 60 

GCCGOSTCTC TCAGGACAfiC AGGCCCCTGT CCTTCTGTC6 GG06C06CTC AGCOGTGCCC 120 

TCCCCCCCTC AGGTTCTTTP TCTAATTCCA AATAAACTTG CAAGAGGACT AT GAAAG ATT 180 

ATGATGAACT TCTCAAATAT TATGAATTAC ATGAAACTAT TGGGACAGGT GGCTTTGCAA 240 

AGGTCAAACT TGCCTGCCAT ATCCTTACTG GAGAGATGGT AGCTATAAAA ATCATGGATA 300 

AAAACACACT AGGGAGTGAT TTGCCCCGGA TCAAAACGGA GATTGAGGCC TTGAAGAACC 360 

TGAGACATCA GCATATATGT CAACTCTACC ATGTGCTAGA GACAGCX^UVC AAAATATTCA 420 

TGGTTCTTCA GTACTGCCCT GGAGGAGAGC TGTTTGACTA TATAATTTCC CAGGATCGCC 480 

TGTCAGAAGA GGAGACCCX3G GTTGTCTTCC GTCAGATAGT ATCTGCTGTT GCTTATGTGC 540 

ACAGCCAOjG CTATGCTCAC AGGGACCTCA AGOCAGAAAA TTTGCTGTTT GATGAATATC 600 

ATAAATTAAA GCTGATTGAC TTTGGTCTCT GTGCAAAACC CAAGGGTAAC AAGGATTACC 660 

ATCTACAGAC ATGCTGTGGG AGTCTGGCTT ATGCAGCACC TGAGTTAATA CAAGGCAAAT 720 

CATATCTTGG ATCAGAGGCA GATGTTTGGA GCATGGGCAT ACTGTTATAT GTTCTTATGT 780 

GTCGATTTCT ACCATTTCAT 6AT6ATAATG TAATGCCTTT ATACAAGAAG ATTATGAGAG 840 

GAAAATATGA TGTTCCCAAG TGGCTCTCTC CCAGTAGCAT TCTGCTTCTT CAACAAATGC 900 

TGCAGGTGGA CCCAAAGAAA OGGATTTCTA TGAAAAATCT ATTGAACCAT CCCTGGATCA 960 

TGCAAGATTA CAACTATCCT GTTGAQTGGC AAAGCAAGAA TCCTTTTATT CACCTCGATG 1020 

ATGATTGCGT AACAGAACTT TCTGTACATC ACAGAAACAA CAGGCAAACA ATGGAGGATT 1080 

TAATTTCACT GTGGCAGTAT GATCACCTCA CGGCTACCTA TCTTCTGCTT CTAGCCAAGA 1140 

AGGCTCGGGG AAAACCAGTT CGTTTAAG6C TTTCTTCTTT CTCCTGTGGA CAAGCCAGTG 1200 

CTACCCCATT CACAGACATC AAGTCAAATA ATTGGAGTCT GGAAGATGTG ACCGCAAGTG 1260 

ATAAAAATTA TGTGGCGGGA TTAATAGACT ATGATTGGTG TGAAGATGAT TTATCAACAG 1320 

GTGCTGCTAC TCCCCGAACA TCACAGTTTA CCAAGTACTG GACAGAATCA AATGGGGTGG 1380 

AATCTAAATC ATTAACTCCA GCCTTATGCA GAACACCTGC AAATAAATTA AAGAACAAAG 1440 

AAAATCTATA TACTCCTAAG TCTGCTGTAA AGAATGAAGA GTACTTTATG TTTCCTGAGC 1500 

CAAAGACTCC AGTTAATAAG AACCAGCATA AGAGAGAAAT ACTCACTACG CCAAATOGTT 1560 

ACACTACACC CTCAAAAGCT AGAAACCAGT GCCTGAAAGA AACTCCAATT AAAATACCAG 1620 

TAAATTCAAC AGGAACAGAC AAGTTAATGA CAGGTGTCAT TAGCCCTGAG AGG0GGT6CC 1680 

GCTCAGTGGA ATTGGATCTC AACCAAGCAC ATATGGAQGA GACTCCAAAA AGAAAGGGAG 1740 

CCAAAGTGTT TGGGAGCCTT GAAAGGCGGT TGGATAAGGT TATCACTGTG CTCACCAGGA 1800 

GCAAAAGGAA GGGTTCTGCC AGAGACGGGC CCAGAAGACT AAAGCTTCAC TATAATGTGA 1860 

CTACAACTAG ATTAGTGAAT CCAGATCAAC TGTTGAATGA AATAATGTCT ATTCTTCX^A 1920 

AGAAGCATGT TGACTTTGTA CAAAAGGGTT ATACACTGAA GTGTCAAACA CAGTCAGATT 1980 

TTGGGAAAGT GACAATGCAA TTTGAATTAG AAGTGTGCCA GCTTCAAAAA CCCGATGTGG 2040 

TGGGTATCAG GAGGCAGCGG CTTAAGGGOS ATGCCTGGGT TTACAAAAGA TTAGTGGAAG 2100 

ACATCCTATC TAGCTGCAAG GTATAATTGA TGGATTCTTC CATCCTGCCG GATGAGTGTG 2160 

GGTGTGATAC AGCCTACATA AAGACTGTTA TGATCGCTTT GATTTTAAAG TTCATTGGAA 2220 

CTACCAACTT GTTTCTAAAG AGCTATCTTA AGACCAATAT CTCTTTGTTT TTAAACAAAA 2280 

GATATTATTT TGTGTATGAA TCTAAATCAA GCCCATCTGT CATTATGTTA CTGTCTTTTT 2340 

TAATCATGTG GTTTTGTATA TTAATAATTG TTGACTTTCT TAGATTCACT TCCATATGTG 2400 

AATGTAAGCT CTTAACTAT6 TCTCTTTGTA ATGTGTAATT TCTTTCTGAA ATAAAACCAT 2460 
TTGTGAATAT 

Seq ID NO: 653 Protein sequence 
Protein Accession #: NP_055606.1 

1 11 21 31 41 51 

MKDYDELLKY YELHETIGTG GFAKVKLACH ILTGBMVAIK IMDKNTLGSD LPRIKTEIEA 60 

liKNLRHQHIC QLYHVUETAN KIFMVLEYCP GGELFDYIIS QDRLSEEETR WPRQIVSAV 120 

AYVHSQGYAH RDLKPENLLF DEYHKLKLID FGLCAKPKGN KDYHLQTCCJG SLAYAAPELI 180 

QGKSYLGSEA DVWSMGILLY VLMCGFLPPD DDNVMALYKK IMRGKYDVPK WLSPSSILLL 240 

QQMLQVDPKK RISMKNLLNH PWIMQDYNYP VEWQSKNPFI HLDDDCVTEL SVHHRKNRQT 300 

NEDLISLKQY DHLTATYLLL LAKKARGKPV RLRLSSFSG6 QASATPFTDI KSMNWSLBDV 360 

TASDKNYVAG LIDYDWCEDD LSTGAATPRT SQFTKYWTES NGVESKSIiTP AIiCRTPAHKL 420 

KNKENVYTPK SAVKMEEYPM FPEPKTPVNK NQHKREILTT PNRYTTPSKA RNQCLKETPI 480 

KIPVNSTGTD KLMTGVISPE RRCRSVELDL NQAHMEETPK RKGAKVPGSL ERGLDKVITV 540 

LTRSKSKG3A RDCPRRLKLH YNVTTTRLVN PDQLLNEIMS ILPKKHVDFV QKGYTUCCQT 600 
QSDFGKVTMQ FELEVCQLQK PDWGIRRQR LKGDASWYKR LVEDII.SSCK V 

Seq ID NO: 654 DNA sequence 
Nucleic Acid Accession #: NM_000582 
Coding sequence: 88..990 

1 11 21 31 41 51 

GCAGAGCACA GCATCGTCGG GACCAGACTC GTCTCAGGOC AGTTGCAGCC TTCTCAGCCA 60 

AAGGGGGACC AAGGAAAACT CACTACCATG AGAATTGCAG TGATTTGCTT TTGCCTCCTA 120 

GGCATCACCT GTGCCATACX AGTTAAACAG GCTGATTCTG GAAGTrCTGA GGAAAAGCAG 180 

CTTTACAACA AATACCCAGA TGCTGTGGCC ACATGGCTAA ACCCTGACOC ATCTCAGAAG 240 

CAGAATCTGC TAGOCCCRCA GACCCTTOCA AGTAAGTCCA ACGAAAGOCA TGAOCA^TC 300 

GATGATATGG ATGATGAAGA TGATGATGAC CATGTGGACA GCCACGACTC CATTGACTOG 360 

AACGACTCTG ATGATGTAGA TGACACTGAT GATTCTCACC AGTCTGATGA GTCTCACCAT 420 

TCTGATGAAT CTGATGAACT GGTCACTGAT TTTCCCACGG ACXTTGCCACC AACXX»AGTT 480 

TTCACTCCAG TTGTCCCCAC AGTAGACACA TATGATGGCC GAGGTGATAG TGTGGTTTAT 540 

GGACTGAGGT CS^AAATCTAA GAAGTTTCGC AGACCTGACA TCCAGTACCC TGATGCTACA 600 

GAOQAGGACA TCACCTCACA CATGGAAAGC GAGGAGTTGA ATGGTGCATA CAAGGCCATC 660 

CXXGTTCCCC AGGACCTGAA OCCCCCTTCT GATTGG6ACA GCCGTGGGAA GGACAGTTAT 720 

GAAAOGAGTC AGCTGGATGA CCAGAGTGCT GAAACCCACA GCCACAAGCA GTCCAGATTA 780 

TATAAGOGGA AAGCCAATGA TGAGAGCAAT GAGCATTCCG ATGTGATTGA TAGTCAGGAA 840 

CTTTCCAAAG TCAGCCGTGA ATTCCACAGC CATGAATTTC ACAGCCATGA AGATATGCTG 900 

GTTGTAGACC CCAAAAGTAA GGAAGAAGAT AAACACCTGA AATTTCGTAT TTCT CATG AA 960 

TTAGATAGTG CATCTTCTGA GGTCAATTAA AAGGAGAAAA AATACAATTT CTCACT TTGC 1020 

ATTTAGTCAA AAGAAAAAAT GCTTTATAGC AAAATGAAA6 AGAACATGAA ATGCnC lTT 1080 

CTCAGTTTAT TGGTTGAATG TGTATCTATT tGAGTCTGGA AATAACTAAT GTGTTTGATA 1140 

ATTAGTTTAG TTTGTGGCTT CATGGAAACT CCCTGTAAAC TAAAAGCTTC AGGGTTATGT 1200 

CTATGTTCAT TCTATAGAAG AAATGCAAAC TATCAC TGTA TTTTAATATT TGTTATTCTC 1260 

TCATGAATAG AAATTTATGT AGAAGCAAAC AAAATACTTT TACCCACTTA AAAAGAGAAT 1320 

ATAACATTTT ATCTCACTAT AATCTTTTGT TTTTTAAGTT AGTGTATATT TTGTTGTGAT 1380 

tATCPTTTTG TGGTCTGAAT AAATCTTTTA TCTTGAATCT AATAAGAATT TGGTGGT GTC 1440 

AATTGCTTAT TrGrrTTCCC ACGGTTGTCC AGCAATTAAT AAAACATAAC CTTTrrrACT 1500 
GCCTAAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 655 Protein sequence 
Protein Accession #: NP_000573 

1 11 21 31 41 51 

| | | | | |

MRIAVICFO. LGITCAIPVK QADSGSSEEK QLYNKYPDAV ATWLNPDPSQ KQULIAPCyTL 
PSKSNESHDH MDDMDDEDDD DHVDSQDSID SNDSDDVDDT DDSHQSDESH HSDESDELVT 
DFPTDLPATE VFTPWPTVD TYDGRGDSW YGLRSKSKKP RllPOIQYPDA TDEDITSHME 
SEBLNGAYXA IPVAQDLMAP SDWDSRGKDS YETSQUJDQS AETHSHKQSa LYRRKANDES 
NEHSDVIDSQ ELSKVSREFH SHEPHSHEDM LWDPKSKEE DXHLKFRISH ELDSASSEVN 

Seq ID NO: 656 DNA sequence 

Nucleic Acid Accession #: NM_003108.1 

Coding sequence: 76..1401 



1 11 21 31 41 51 

| | | | | |

GGGGTGGGAG GGGGAGGGGG ACCTCCGCAC GAGACCCAGC GGCCCGGGTT GGAGCGTCCA 60 
GCCCTGCAAC GGATCATGGT GCAGCAGGCG GAGAGCTTGG AAGOGGAGAG CAACCTGCCC 120 
CGGGAGGCGC TGGACACGGA GGAGGGOGAA TTCATGGCTT GCAGCCCGGT GGCCCTGGAC 180 
GAGAGCGACX: CAGACTGGTG CAAGACGGCG TCGGGCCACA TCAAGCGGCC GATGAACGCG 240 
TTCATGGTAT GGTCCAAGAT CGAACGCAGG AAGATCATGG AGCAGTCTCC GGACATGCAC 300 
AACGCCXSAGA TCTCCAAGAG GCTGGGCAAG CGCTGGAAAA TGCTGAAGGA CAGCGAGAAG 360 
ATCCCGTTCA TCCGGGAGGC GGAGOGGCTG CGGCTCAAGC ACATGGCOGA CTACCCOGAC 420 
TACAAGTACC GGCCCCGGAA AAAGCCCAAA ATGGACCCCT CGGCCAAGCC CAGCGCCAGC 480 
CAGAGCCCAG AGAAGAGCGC GGCCGGCGGC GGCGGCGGGA GCGCGGGCGG AGGCGCGGGC 540 
GGTGCCAAGA CCTCCAAGGG CTCCAGCAAG AAATGCX3GCA AGCTCAAGGC CCCCGCGGCC 600 
GCGGGCGCCA AGGCGGGCGC GGGCAAGGGG GCCCAGTCCX3 GGGACTACGG GGGCCCCGGC 660 
GAGGACTAOG T6CTGGGCAG CCTGCG0GT6 AGCGQCTCGG GCGG06GCG6 OGCGGGCAAG 720 
ACGGTCAAGT GCGTGTTTCT GGATGAGGAC GACGACGACG ACGAOGACGA OCSACGAGCTG 780 
CAGCTGCAGA TCAAACAGGA GCCGGACGAG GAGGACGAGG AACCACCGCA CCAGCAGCTC 840 
CTGCAGCCGC CGGGGCAGCA GCCX3TCGCAG CTGCTGAGAC GCTACAACGT CGCCAAAGTG 900 
CCCGCXaVGCC CTACGCTGAG CAGCTOGGCG GAGTCCCCCG AGGGAGCGAG CCTCTACGAC 960 
GAGGTGCGGG CCGGCGCGAC CTCXK3GCGCC GGGGGCGGCA GCCGCCTCTA CTACAGCTTC 1020 
AAGAACA7CA CCAAGCAGCA CCCGCCGCCG CTCGCGCAGC CCGCGCTGTC GCCCGCGTCC 1080 
TOSCGCTCGG TGTCCACCTC CTCXjTCCAGC AGCAGCGGCA GCAGCAGCGG CAGCAGCGGC 1140 
GAGGACGCCG ACGACCTGAT GTTCGACCTG AGCTTGAATT TCTCTCAAAG CGCGCACA6C 1200 
GCCAGCGAGC AGCA6CTGGG GGGCGGGGGG GCGGCCGGGA ACCTGTCCCT 6TCGCTGGTG 1260 
GATAAGGATT TGGATTCGTT CAGCX3AGGGC AGCCTGGGCT CCCACTTCGA GTTCCCCGAC 1320 
TACTGCACGC CGGAGCTGAG CGAGATGATC GCGGGGGACT GGCTGGAGGC GAACTTCTCC 1380 
6ACCTGGTGT TCACATATTG AAAGGCGCCC GCTGCTCGCr CTTTCTCTCG GAGGGTGCAG 1440 
AGCTGGGTTC CTTGGGAGGA AGTTGTAGTG GTGATGATGA TQATGA3GAT AATGATGATG 1500 
ATGATGGTGG TGTTGATGGT GGCGGTGGTA GGGTGGAGGG GA6AGAAGAA GATGCTGATG 1560 
ATATTGATAA GATGTCGTGA CGCAAAGAAA TTGGAAAACA TGATGAAAAT TTTOSTGGAG 1620 
TTAAAGTGAA ATGAGTAGTT TTTAAACATT TTTCCTGTCC TTTTTTTGTC CCCCCTCCCT 1680 
TCCTTTATCG TGTCTCAAGG TAGTTGCATA CCTAGTCTGG AGTTGTGATT ATTTTCCCAA 1740 
AAAATGTGTT TTTGTAATTA CTATTTCTTT TTCCTGAAAT TCGTGATTGC AACAAAGGCA 1800 
GAGGGGGCGG CGCGGCGGAG GGGAGGTAGG ACCCGCTCCC GAAGGCGCTG TTTGAAGCTT 1860 
GTOGGTCTTT GAAGTCTGGA AGACGTCTGC AGAGGACCCT TTTGGCAGCA CAACTGTTAC 1920 
TCTAGGGAGT TGGTG6AGAT ATTTTTTTTT CTTAAGAGAA CTTAAA6AAC TGGTGATTTT 1980 
TTTTTAACAA AAAAAGGG 


Seq ID NO: 657 Protein sequence 
Protein Accession #: NP_003099.1 

1 11 21 31 41 51 

MVQQAESLEA ESMLPREALD TEEGEFMACS PVALDESDPD WdCTASOTIK RPMNAfWWS 
KIERRKIMEQ SPDMHNAEIS KRLGKRWKML KDSBKIPPIR EAERIiRLKHM ADYPDYKVRP 
EKKPiQxiDPSA KPSASQSPSK SAAGGGGGSA GGGAGGAKTS KGSSKKOGKL KAPAAAGAXA 
GAGKAAQSGD YGGAGDDYVL GSLRVSGSGG G6AGKTVKCV FU>EDO0D0O DDDEL0U3IK 
QEPDBBDEEP PHQQLLQPPG QQPSQLLRRY KVAKVPASPT LSSSAESPGG ASL YDSV RftG 300 

ATSGAGGGSR LYYSFRNITK QHFPPLAQFA LSPASSRSVS TSSSSSSGSS SGSSGEDADD 360 

LMFDLSUIFS QSAESASEQQ USCGAMiSSa, SLSLVDKDID S?S3GSLGSH FEFPDYCTFS 420 
liSEMIAGDSfL BANFSDIiVFT Y 

Seq ID NO: 658 DNA sequence 
Nucleic Acid Accession #: NM_001719 
Coding sequence: 123..1418 

1 11 21 31 41 51 

| | | | | |

GGGC6CA6CG GGGCCCGTCT GCAGCAAGTG ACG6ACGGCC GGGAOGGGOG CCTGOCCCCT 60 

CTGCCACCTG GGGOGGTGOG GGCCCGGAGC COGGAGCCXX* GGTAGCGCCT AGAGCOGGCG 120 

CGATGCACGT GCGCTCACTG CGAGCTGCGG CGCOGCACAG CTTCGTGGCG CTCTGGGCAC 180 

CCCTGTTCCT GCTGCGCTCC GCCCTGGCCG ACTTCAGCCT GGACAACX3AG GTGCACTOGA 240 

GCTTCATCCA CCGGOGCCTC OGCACCCAGG AGCGGCGGGA GATGCAGCGC GAGATCXTTCT 300 

CCATTTTOGG CTTGCCCCAC CX3CCCGCGCC CGCACCTCCA GGGCAAGCAC AACTCGGCAC 360 

CCATGTTCAT GCTGGACCTG TACAACGCCA TC3GOGGTGGA GGAGGGOOGC GGGCCOGGCG 420 

GCCAGGGCTT CTCCTACCCC TACAAGGCCG TCTTCAGTAC CCAGOGCCCC CCTCTGGCCA 480 

GCCTGCAACSA TAGCCATTTC CTCACCGACG CCGACATGGT CATGAGCITC GTCAACCTCG 540 

TCGAACATGA CAAGGAATTC TTCCACCCAC GCTACX»CCA TOGAGAGTTC CGGTTTCATC 600 

TTTCCAAGAT CCCAGAAGCG GAAGCTGTCA OGGCAGCOGA ATTCCX3GATC TACAAGGACT 660 

ACATCCGGGA ACX3CTT0GAC AATGAGACGT TCOGGATCAG OGTTTATCAG GTCCTCCAGG 720 

AGCACTTGGG CAGGGAATCG GATCTCTTCC TGCTCGACAG CCGTACCCTC TGGGCCTOGG 780 

AGGACGGCT6 OCTGGTGTTT GACATCACAG CCACCAGCAA CCACTGGGTG GTCAATCCGC 840 

GGCACAACCT GGGCCTGCAG CTCTOGGTGG AGAOGCTGGA TGGGCAGAGC ATCAACCCCA 900 

AGTTGGOGGG CCTGATTGGG CGGCACGGGC CCCAGAACAA GCAGCCCTTC ATGGTGGCTT 960 

TCTTCAAGGC CACGGAGGTC CACTTCCGCA GCATCOOCTC CACGGGGAGC AAACAGOGCA 1020 

GCCAGAACOG CTOCAAGAOG CCCAAGAACC AGGAAGCCCT GCGGATGGCC AAOGTGGCAG 1080 

AGAACAGCAG CAGGGACCAG AGGCAGGCCT GTAAGAAGCA CGAGCTGTAT GTCAGCTTCC 1140 

GAGACCTX3GG CTGGCAGGAC TGGATCATOG CGCCTGAAGG CTACGCCGCC TACTACTGTG 1200 

AGGGGGAGTG TGCCTTCCCT CTQAACTCCT ACATGAAOGC CACCAACCAC GCCATCGTGC 1260 

AGACGCTGGT CCACTTCATC AACCOSGAAA CGGTGCOCAA GCCCTGCTGT GCXXCCAOX 1320 

AGCTCAATGC CATCTCCGTC CTCTACTTCG ATGACAGCTC CAACGTCATC CTGAAGAAAT 1380 

ACAGAAACAT GGTGGTCCGG GCCTGTGGCT GCCACTAGCT CCTCCXiAGAA TTCAGACCCT 1440 

TTGGGGCCAA GTTTTTCTGG ATCCTCCATT GCTOGCCTTG GCCAGGAACC AGCAGACCAA 1500 

CTGCCTTTTG TGAGAOrTTC CCCTCCCTAT CCGCAACTTT AAAGGTGTGA GAGTATTAGG 1560 

AAACATGAGC AGCRTATGGC TTTTGATCAG TTTTTCAGTG GCAGCATCCA ATGAACAAGA 1620 

TCCTACAAGC TGTGCAGGCA AAACCTAGCA GGAAAAAAAA ACAACGCATA AAGAAAAATG 1680 

GCCGGGCCAG GTCATTCGCT GGGAAGTCTC AGCCATGCAC GGACTOGTTT CCAGAGGTAA- 1740 

TTATGAGCGC CTACCA6CCA GGCCACCCAG COGTGGGAGG AAGGGGGCXTT GGCAAGGGGT 1800 

GGGCACATTG GTGTCTGTGC GAAAGGAAAA TT6ACCCGGA AGTTCCTGTA ATAAATGTCA 1860 
CAATAAAAC6 AATGAATG 

Seq ID NO: 659 Protein sequence 
Protein Accession #: NP_001710 

1 11 21 31 41 51 

| | | | | |

KHVRSLRAAA PHSFVALWAP LFLLRSALAD PSLDNEVHSS PIHRRLRSQE RREMQREILS 60 

ILGLPHRPRP HLQGKHNSAP MFMLDLYMAM AVEEGQGPGG OGFSYPVKAV FSTQGPPLAS 120 

LQDSHFIjTDA DMVMSFVNLV EHDKEFPHPR YHHREFRFDL SKIPEGEAVT AAEFRIYKDY 180 

IRERFDNETF RISVYQVLQE HLGRESDLFL LDSRTLWASE EGWIiVFDITA TSNHWWNPR 240 

HZfLGLQIiSVE TLDGQSINPK LAGIilGRHGP QNKQPFMVAP FKATBVHFRS IRSTGSKQRS 300 

QNRSKTPXKQ EALRMANVAE NSSSDQRQAC KXHELYVSFR DLGHQDWIIA FEGYAAYYCB 360 

GECAPPLMSY MMATNKAIVQ TLVKPINPET VPKPCCAPTQ LNAISVLYFD DSSNVILKKY 420 
RNMWRACGC H 

Seq ID NO: 660 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 211..1895 

1 11 21 31 41 51 

| | | | | |

GGATCTGAGG GGCGCCCAGT CACTTCCTCC AGGTTCTCGT GCTGGGCGGG AGGAGCGGAT 60 

GGGGCTTGGG AGGCAGCCTG CTCTCCAGTC CCTATCCACC CACAGGTTTT TTGGGTOGGA 120 

GA6GAATTAT CTGATAAAAT TCCTGGCTTA ATATTTTTAA AAACGGAGAG TTTTTAAAAA 180 

TGATmrTT CCCTOGAAAA TOACCTTTTT ATGCTTCGAA GCAGTTTGTC AACCAGCATA 240 

GTGCTTTTTC TTTTCTCTTC TTTTTCTACG ATAAAT6AAA GCATTTCTTC AAGAAAAAGG 300 

CACAGGTTCC TTGAACAGCT GGATTCTGAT GGCACCATTA CTATAGAGGA GCA6ATTGTC 360 

CTTGTGCTGA AAGCX3AAAGT ACAATGTGAA CTCAACATCA CAGCTCAACT CCAGQAGGGA 420 

GAAGGTAATT GTTTCCCTGA ATGGGATGGA CTCATTTGTT G6CCCAGAGG AACAGTGGGG 480 

AAAATATOGG CTGTTCCATG CCCTOCTTAT ATTTATGACT TCAACCATAA AGGAGTTGCT 540 

TTCOGACACT GTAACCCCAA TGGAACATGG GATTTTATCC ACAGCTTAAA TAAAACATGG 600 

GCCAATTATT CAGACTGCCT TCGCTTTCTG CAGCCAGATA TCAGCATAGG AAAGCAAGAA 660 

TTCTTTGAAC GCCrCTATGT AATGTATACC OTTGGCTACT CCATCTCTTT TGGTTCCTTG 720 

GCT6TGGCTA TTCTCATCAT TGGTTACTTC AGACGATTGC ATTGCACTA6 GAACTATATC 780 

CACATGCACT TATTTGTGTC TTTCATGCTG AGAGCTACAA GCATCTTTGT CAAAGACA6A 840 

GTAGTCCATG CTCACATAGG AGTAAAGGAG CTGGAGTCCC TAATAATGCA GGATGACCCA 900 

CAAAATTCCA TTGAGGCAAC TTCTGTGGAC AAATCACAAT ATATOGGGTG CAAGATTGCT 960 

GTTGTGATGT TTATTTACTT CCTGGCTACA AATTATTATT GGATCCTGGT GGAAGGTCTC 1020 

TACCTGCATA ATCTCATCTT TGTGGCTrTC TTTTCGGACA CCAAATACCT GTGGGGCTTC 1080 

ATCTTGATAG GCTOGGGGTT TCCAGCAGCA TTTGTTGCAfi CATGGGCTGT G GCACG AGCA 1140 

ACTCTGGCTG ATGOGAGGTG CTGGGAACTT AGTGCTGGAG ACATCAAGTG GATTT ATCAA 1200 

GCAOOGATCT TAGCAGCTAT TGGGCTGAAT TTTATTCTGT TTCTGAATAC GGTTAOAGTT 1260 

CTAGCTACCA AAATCTGGGA GACCAATGCA GTTGGGCAT6 ACACAAGGAA GCAATACAG6 1320 

AAACT6GCCA AATOGACACT GGTCCTGGTC CTAGTCTTTO GAGTGCATTA CATCCTCTTC 1380 

GTATGCCTCC CrCACTCCTT CACTGGGCTC GGGTGGGAGA TCCGCATGCA CTGTGAGCTC 1440 

TTCTTCAACT CCTTTCAGGG ' m v r i'lXf m TCTATCATCT ACTGCTACTG CAATGGAGAG 1500 

GTTCAGGCAG AGGTGAAGAA GATGTGGAGT OGGTGGAATC TCTCCX3TGGA CTGGAAAAGG 1560 

ACACOGCCAT GTGGCAGCG6 CA6A1XS0GGC TCAGTGCTCA CCAOOSTGAC GCACAGCACC 1620 

AGCAGCCAGT CACAGGTGGC GGOCAGCACA OGCATGGTGC TTATCTCTGG CAAAGCTGCC 1680 

AAGATCGCCA GCAGACAGCC TGACAGCCAC ATCACTTTAC CTGGCTATGT CTGGAGTAAC 1740 

TCAGAGCAGG ACTCCCtGCC ACACTCTTTC CAOGAGGAGA CCAAGGAAGA TAGTQC3GAGG 1800 

CAQGGAGATG ATATTCTAAT GGAGAA GOCT TCCAGGCCTA TGGAATCTAA CGCAGACACT 1860 
GAAGGAT6CC AAGGAGAAAC TGAGGATGTT CTCTGA 

Seq ID NO: 661 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | |

HLRSSLSTSl VLFLPSSFST INBSISSHKH HRFLEQU5SD CTITIBBOIV LVUCAKVQCE 60 

UJITAQLQEG EGKCFPEWDG LICWPaGTVG KISAVPCPPY lYDPNHKGVA FRHCMiWGTW 120 

DPMHSLNKTW ANYSDCLRFL QPDISIGKQE PPERLYVMYT VGYSISPGSL AVAILIIGYF 180 

RRLHCTRNYI HMHLPVSFMIi RATSIFVKDR WHAHIGVKB IxESI<IKQZ5DP QNSIEATSVD 240 

KSQYIGCKZA WMPIYFLAT NYYWILVE5GL YLHNLIPVAF PSDTKYLWGF ILIGWGFPAA 300 

FVAAHAVARA TliADARCWEL SAGDIKHIYQ APILAAIGLN PlIiFLNTVmV LATKIWETNA 360 

VGHDTRKQYR KLAKSTLVLV LVPGVHYIVF VCLPHSPTX3L GWBIRMKCSL PFNSFQGFFV 420 

SZIYCYCHGE VQAEVKKMWS RHMLSVDHKR TPPCGSRRG6 SVLTTVT HST SSQSQVAAST 480 

RKVLISGKAA KIASRQPDSH ITLPGYVWSN SEQDCLPBSF HEBTKEDSGR Q6DDIWEKP 540 
SRPMESNPOT EGGQGETEDV L 

Seq ID NO: 662 DNA sequence 
Nucleic Acid Accession #: NM_005048 
Coding sequence: 143..1795 

1 11 21 31 41 51 

| | | | | |

GGCCGGTGGC COGGGCCCGA OCACCCCAGC TGCGCGTCGT TACTGGCCAC AAGTTTGCTC 60 

TCGGCCAGCC AAGTTGGCSiA CTTGGAAGCT TCTCCCGGGC TCTGGAGGAG GGTCCCTGCT 120 

TCTTCCTACA GCCGTTCCGG GCATGGCCGG GCTGGGGGCG TCGCTCCAOG TCTGGGGTTG 180 

GCTAATGCTC GGCAGCTGCC TCCTG60CAG AGCCCAGCTG GATTCTGATG GCACCATTAC 240 

TATAGAGGAG CAGATTGTCC TTGTGCTGAA ^GCGAAAGTA CAATGTGAAC TCAACATCAC 300 

AGCTCAACTC CAGGAGGGAG AAGGTAATTG TTTCCCTGAA TGGGATGGAC TCATTTGTTG 360 

GCCCAGAGGA ACAGTCGGGA AAATATCGGC TGTTCCATGC CCTCCTTATA T TTAT GACTT 420 

CAACCATAAA GGAGTTGCTT TCCGACACTG TAACCCCAAT GGAACATGGG ATTTTATGCA 480 

CAGCTTAAAT AAAACATCGG CCAATTATTC AGACTGCCTT 05CTTTCTGC AGCCAGATAT 540 

CAGCATAGGA AAGCAAGAAT TCTTTGAACG CCTCTATGTA ATGTATACOG TTGGCTACTC 600 

CATCTCTTTT GGTTCCTTGG CTGTGGCTAT TCTCATCATT GGTTACTTCA GACGATTGCA 660 

TTGCACTAGG AACTATATCC ACATGCACTT ATTTGTGTCT TTCATGCTGA GAGCTACAAG 720 

CATCTTTGTC AAAGACAGAG TAGTCCATGC TCACATAGGA GTAAAGGAGC TGGAGTCCCT 780 

AATAATGCAG GATGACCCAC AAAATTCCAT TGAGGCAACT TCTGTGGACA AATCACAATA 840 

TATCGG6TGC AAGATTGCTG TTGTGATGTT TATTTACTTC CTGGCTACAA ATTATTATTG 900 

GATCCTGGTG GAAGGTCTCT ACCTGCATAA TCTCATCTTT GTGGCTTTCT TTTCGGACAC 960 

CAAATACCTG TGGGGCTTCA TCTTGAtAGG CTGGGGGTTT OCAGCAGCAT TTGTTGCAGC 1020 

ATGGGCTGTG GCACGAGCAA CTCTGGCTGA TGOSAGGTGC TGGC3AACTTA GTGCTGGAGA 1080 

CATCAAGTGC ATTTATCAAG CACCX3ATCTT AGCAGCTATT GGGCTGAATT TTATTCTGTT 1140 

TCTGAATACG GTTAGAGTTC TAGCTACCAA AATCTGGGAG ACCAATGCAG TTGGGCATGA 1200 

CACAAGGAAG CAATACAGGA AACTGGCCAA ATOGACACTG GTCCTGGTCC TAGTCTTTGG 1260 

AGTCCATTAC ATCGTGTTCG TATGCCTGCC TCACTCCTTC ACTGGGCTCG GGTGGGAGAT 1320 

COGCATGCAC TGTGAGCTCT TCTTCAACTC CTTTCAGGGT TTCTTTOTGr CTATCATCTA 1380 

CTGCTACTGC AATGGAGAGG TTCAGGCAGA GGTGAAGAAG ATGTGGAGTC GGTGGAATCT 1440 

CTCCX?rGGAC TGGAAAAGGA CACCX3CCATG TGGCAGCCGC AGATGCGGCT CAGTGCTCAC 1500 

CACOGTGACG CACAGCACCA GCAGCCAGTC ACAGGTGGCG GCCAGCACAC GCATGGTGCT 1560 

TATCTCTGGC AAAGCTGCCA AGATOGCCAG CAGACAGCCT GACAGCCACA TCACTTTACC 1620 

TCGCTATGTC TGGAGTAACT CAGAGCAGGA CTGCCTGCCA CACTCTTTCC ACX3AGGAGAC 1680 

CAAGGAAGAT AGTGGGAG6C AGGGAGATGA TATTCTAATG GAGAAGCCTT CCAGGCCTAT 1740 

GGAATCTAAC CCAGACACTG AAGGATGCCA AGGAGAAACT GAGGATGTTC TCTGAATGGA 1800 

CATTTGTGGC TGACTTTCAT GGGCTGGTCC AATGGCTGGT TGTGTGAGAG GGCTTGGCTG 1860 

ATACrCCTAT GCTTGAGTTC AAAGGCTGAA AATTCAGTTA AGGTGTTACT TAATAATAGT 1920 

TTTTAGGCTC CATGAATTGG CTCCTGTAAA TACTAACGAC ATGAAAATGC AAGTGTCAAT 1980 

GGAGTAGTTT ATTACCTTCT ATTGGCATCA AGTTTTCCTC TAAATTAATG TATGGTATTT 2040 

GCTCTGTGAT TGTTCATTTT TTTCTGCTAC TTTTGGGTAG AAAAAAGATT CAATTGCTTG 2100 

GCTGTAGCTT TCTCTCATAT ATATCACCCT AAATATAATG AAGATCTTTT AGTGTGTATC 2160 

ATTTTCCrrr TAGAAACTAG TATTCTCTTA TTTCTTACTT TAATGTACTT CTATCACTGC 2220 

ATTTATTTTG OCTGTGCATA GGAGCAATTA GGATCTAAAA AAATATATGG GAAGATAAAA 2280 

GATCTAAGAA CAAGTACTTG CTGGAAAATT AGTTGGCTGG ACATTGATAA AATAATGCAT 2340 

TTATAACAAT TACATCTGTT TTTGGGAACA AGGAAAATTT CTCAAAAAAG AATATTTCAC 2400 

ACATCCCTTC TTTTGAATGG CCTCTTTGTG ACCAGCCAGA CCTCAGGTCT TCACTCTTTC 2460 

TTCTTTGTAA ACCATGTCAT GTGGAAAGAT TTCCTCAGTT AGTGAGCTTG TGTCTG CAAA 2520 

TTGATTTTGT TTGTAATGTA TTTTGATAGC AAATCATGCT GCATCTATAT CTTTTTCTTG 2580 

TTTGAGCTGT TACTACATTG TACATGGCAT GTGG6ATCAA TTAAAAATTT GTTTTAAAAA 2640 
T 

Seq ID NO: 663 Protein sequence 
Protein Accession #: NP_005039

1 11 21 31 41 51

| | | | | |

MAGLGASLHV WGWLMUJSCL LARAQLDSDG TITIEBQIVL VLKAKVQCEL NZTAQM)BGE 60 

<aiCFPEWDGL ICWPRGTVGK ISAVPCPPYI YDFNHKGVAP RHCNPNGTWD FMHSLNKTWA 120 

KYSDCLRFDQ PDISIGXQEF FERLYVMyTV 6YSISFGSLA VAILIIGYFR RLHCTRNYIH 180 

MHLFVSPMLR ATSIPVKDHV VKRHIGVKEL BSLIHQDOPQ NSIEATSVDK SQVIGCKIAV 240 
VMFIYFLATH YYWILVEGLY LHNLIPVAFP SDTKYUJGPI LIGWGFPAAP VAAKAVASAT 300

liADARCWELS AGDIKWIYOA PILAAIGI«NP ILFUnVRVL ATKIWBTKAV GH2JTRK0YRK 360 

LAKSTLVLVL VPGVHYIVFV CLPHSFTGLG KEIHKECKLF FNSFQGrFVS IIYCYCNGEV 420 

QASViOWSR loa.SVDsnCBT PPCGSRRG6S VLTTVTHSTS SQSOVAASTR HVLISGKAMC 480 

lASRQFDSHI TLPGYVWSKS EQOCLPHSFH EETKEDSGRQ GDDILMEKPS RPMESHPOTB 540 
GGQGETEDVL 

Seq ID NO: 664 DNA sequence
Nucleic Acid Accession #: NM_012152
Coding sequence: 43..1104

1 11 21 31 41 51

| | | | | |

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTCTGACTAT 60 

GACAAGCACA TGGACTTTTT TTATAATAG6 AGCA ACACTG ATA CPgPCGA TGACTGGACA 120 

GGAACAAAGC TTGTGATTGT TTTGTGTGTT GGGAOGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCOGAT TTCTTOGCTG GAATTGOCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACCGCTGGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACr TGCTGGTTAT CGCCGTGGAG 420 

AGGCACATGT CAATCATGAG GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT TGCTTGTCTG GGCCATCGCC ATTTTTATGG GGGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGC AGGAG T 600 

TACCTTGTTT TCTGGACACT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA ACCAACXTTCT TGTCTCOGCA TACAAGTGGG 720 

TCCATCAfiCC GCCGGAGGAC ACCXATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780 

GOGTTTGTGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCCTOGACGG CCTGAACTGC 840 

AGGCAGTGTG GOGTGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGOGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAG GAOGAGGACA T6TATQGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACGCA GRGAGGOGTC CCTCTOGCAT CCCCTCCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGCCAG TACATAGftGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG GATGCXncCTC GGCCOWXJCA QGTGATOACT 1140 
GTCTTAGG 

Seq ID NO: 665 Protein sequence
Protein Accession #: NP_036284

1 11 21 31 41 51

| | | | | |

KKECHYDKHM DFFYNRSNTD TVDDWTCTKL VIVLCVGTFF CLPIPFSNSL VIAAVIKNRK 60 

FHFPFYYLLA NLAAADFPAG lAYVFLMFNT GPVSXTLTVN RWPLRQGLLD SSLTASIiTNL 120 

LVIAVERHMS IMRMRVHSNL TKKRVTIiLIL LVWAIAIFMG AVPTLGWNCIi OTISACSSLA 180 

PIYSRSYLVF WTVSNLMAFL IMWVYLRIY VYVKUKTUVL SPHTSGSISR RRTPMKLNKT 240 

VMTVLGAFW CWTPGLVVI.L LDGLlICItQGG VQHVKRHFLL LALLKSWNP IIYSYKDEDM 300 
YGTMKKMICC FSQQIPERRP SRIPSTVLSR SDTGSQYIED SISQCAVCNK STS 

Seq ID NO: 666 DNA sequence
Nucleic Acid Accession #: NM_002821
Coding sequence: 150..3362

1 11 21 31 41 51

| | | | | |

AACTCCCGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60 

GCGCTCCGGT GC3GTCCGCCT CCTGTGCCOG CCGCGGAGCA GTCTGCGGCC CGCCGTGGGC 120 

CCTCAGCTCC TTTTCCTGAG CCCGCCGOGA TGGGAGCTGC GOGGGGATCC CCGGCCAGAC 180 

CCCGCCGGTT GCCTCTGCTC AGCGTCCTGC TGCTGCOGCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGGGGCGC OGGGCGCTGC 300 

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCCG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420 

ACCGGCTGCA GQACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT T6AGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC OCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC OCCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC OGGC CAGC TG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGG6TG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020 

TCCGGCCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCOGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGCCAGC GAGGAGCGTG TGACCT6CCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGGGT GTGGTGGGAO CAGGCGGOAG TC06GCTGCC CACCCATGGC AGGGTCTACC 1260 

AGAAG6GCCA OGAGCTGGTC TTGGOCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCAOGG TTOGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGOGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AOCAGTGCAT GGAGTTT6AC AAGGAGGCCA CGGTQCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTOGAGATGA OGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGOGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGOGA G6CCCAGGGG GACCCCAAGC CGCTQATTCA GTGGAAAGGC AAGGACCGCA 2040 

TCCTGGACCC CACCAAGCTG GGACCCAOSA TGCACATCTT CCAGAATGGC TCOCTQGTGA 2100 
TCCATGAOBT GGCCCCTGAO GACTCAGGOC GCTACACCTG
ACATCAAGCA CftOOGAOGOC CCCCTCTAT6 TCGtGGACAA 
AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC 
COGCTGTGGC CTACATCATT GCCGTGCTGG GCCTCATGTT 
AAGCCAAGOG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC 
GAG6GCCTTT GGAGAAOSGG CAGCCCTCAG CAGAGATCCA 
GCTTGGGCTC G6GCCOC3GGG GCCACCAACA AACGCCACAG 
TCCCACGGTC TACOCTGCAG CCCATCACCA CGCTGGGGAA 
TCCTGGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA 
GCCTGCAG.\C GAAGGATGAG CAGCAGCAGC TGGACTTCOG 
GGAAGCTGAA CCACGCCAAC GTGGTGCGGC TCCTGGGGCT 
ACTACATGGT GCTGGAATAT GTGGATCTGG GAGACCTCAA 
AGAGCAAGGA TGAAAAATTG AAGTCACAGC OCCTCAGCAC 
GCACCCAGGT AGGCCTGG6C ATGGAGCACC TGTGCAACAA 
TGGCTGCGCG TAACTG0CT6 GTCAGTGCCC AGAGACAAGT 
TCAGCAAGGA TGTGTACAAC AGTGAGTACT ACCACTTCOG 
GCTGGATGTC CCCCGAGGCC ATCCTGGACG GTGACTTCTC 
CCTTCGGTGT GCTGATGTGG GAAGTGTTTA CACATCGAGA 
CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC 
GCTGCCCTTC CAAACTCTAT OGGCTGATGC AG06CTGCTG 
GGCCCTCCTT CAGTGA6ATT GCCAGCGCCC TGGGAGACAG 
GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGGAGGA 
CAGCATGATG GGCRAGATCC CTGTCC TCCT GGGCCCTGAG 
TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT 
GGCTGACTTG GACCCAAACT GGGCGACTAG 
CTCTTCCTCT ATCAGGGACA GTGTGGGTGC 

TTCToxxrrr gaccgggtcc aactctgcca 

AGGCTTGGGA TGAGCTGGGT TTGTGGGGAG 
AGGGTTAATG AGTCTCTTGC CCACTGGTCC 
ACACACCAAG TGAGTCCTCC CCACTCTGGG 
CCCCACCCTT CTCTCCTTTC CTCATCCTAA 
CTTTTCACAC TATATAAACC GOCCTTTTTG 
TGCAGGGTGG GGTGGGTGGG CATGGGAGGT 
GCCATCCTTA CCCCACACTT TTATTGTTGT 
IVmTl 'G' lT TTTACACTCG CTGCTCTCAA 




Seq ID NO: 667 Protein sequence
Protein Accession #: NP_002812



MGAARGSPAR 
VHVYWLLDGA 
IKWIEAGPW 
KERNLTLRPA 
EAMFKCQFSA 
CIGQGQHGPP 
VRLPTHGRVY 
SQIjEEGKPGY 
CMSSTPAGSI 
GSSLPEWVTD 
TTVYQGHTAL 
RYTCIAGWSC 
GU4FYCKXRC 
KRHSTSDKMH 
LDFRRELEMF 
PLSTKQKVAL 
YHFRQAHVPL 
AGKARLPQPE 



11 

1 

PRRIiPLLSVL 
PVQDTERRPA 
LKHPASEAEI 
GPEHSGXiYSC 
QPPPSLQHLF 
IILEATLHLA 
QKGH£LVLAN 
LDCLTQATPK 
EAQARVQVIiE 
HAGTLHFARV 
LQCEAQGDPK 
KIKHTEAPLy 
KAKRLQKQPE 
PPRSSLQPIT 
GKLNHANWR 
CTQVAIiGMEW 
SHMSPBAILE 
GCPSKLYRLM 



21 
1 

LLPLLGGTOT 
QGSSLSFAAV 
OPQTQVTIiRC 
CABSAFGQAC 
EDSTPZTITRS 
EIEDMPLFEP 
lAESDAGVYT 
PTWWYRNQM 
KliKFTPPPQP 
TRDDAGNYTC 
PLIQWXGKDR 



GEEPEMECLN 
TLGKSEFGEV 
LLGLCREAEP 
LSNNRFVHKD 
GDFSTKSDVW 
QRCHALSPKD 



31 
I 

AIVPIKQPSS 
DRLQDSGTFQ 
HIDGHPRPTY 
SSQNPTXiSIA 
RPPHIiRRATV 
RVFTAGSEER 
CEAANLAGQR 
IiISEDSRFEV 
QQCMEFDKEA 
lASNGPQGQI 
ILDPTKLGPR 
EGPGSPPPYK 
G6PLQNGQPS 
FLAKAQGLSE 
HYMVLEYVDL 
LAARNCLVSA 
AFGVLMHEVF 
RPSFSEIASA 



CATZGCAOGC 
GCCTGTGCOG 
CATTGGGTTG 
CTACTGCAAG 
AGAGATGGAA 
AGAAGAAGTG 
CACAAGTGAT 
GAGTGAGTTT 
GACCCFGGTA 
GAOGGAGTTG 
GTGCCGGGAG 
GCAGTTCCTG 
CAAGCAGAAG 
C0C5CTTTGTG 
QAAGGTGTCT 
CCAGGCCTGG 
TACCAAGTCT 
GATGCCCCAT 
TAGACTTCCT 
GGCCCTCAGC 
CACOGTGGAC 
CATCTCTAGA 
GTGCCCTAGT 
CTCAOCCTCA 
TCGGCAGTTT 
CCCAATTTCT 
AACriTGCCT 
TTCTCAAOTT 
CTAGACCAGG 
CTGACCCAGA 
GATGAAGGAG 
GGGOGGCTTT 
CCTGGAGATG 
TTGTTTTGTT 
TTTTTTA 



41 
I 

QDALQGRRAL 
CVARDDVTGE 
QWFRDGTPLS 
DESPARWLA 
FANGSLLI.TQ 
VTCLPPKGLP 
RQDVKITVAT 
FKKGTLRINS 
TVPCSATGRE 
RAHVQLTVAV 
MHIFQKGSLV 
HZQTZ6I*SVG 
AEIQEEVALT 
GVABTLVLVK 
GDLKQFLRIS 
QRQVKVSALG 
THGEKPHQGQ 
LGDSTVDSKP 



AACAGCTGCA 
GAGGAGTOQG 
T0GGTGGGT6 
AAGCGCTGCA 
TGCCTCAAOG 
GCCTTGACCA 
AAGATGCACT 
GGGGAGGTGT 
CTTGTGAAGA 
GAGATGTTTG 
GCTGAGCCCC 
AGGATTTCCA 
GTG60CCTAT 
CATAAGGACT 
GCCCTGOGCC 
GTGCOGCTGC 
GATGTCTGGG 
GGTGGGCAGG 
CAGCCCGAGG 
CCCAAGGACC 
AGCAAGCOGT 
GGGAAGCTCA 
GCAACAGGCA 
TCCTTTGGGA 
CCCCTGCCAC 
GGCCTTCAAC 
GGGGAGGGCT 
CTGGGCACAfc 
ATTATAGAGG 
CCCAOGTCTT 
TTTTCAGGAG 
TATATGTAAT 
AGGAGGGTGG 
TTTTTGTTTT 



51 
1 

LRCEVEAPGP 
EARSANASFN 
DGQSNHTVSS 
PQDWVARYE 
VRPRHAGIYR 
BPSVWHEEAG 
VPSWIiKKPQD 
VEVYDGTWYR 
KPTIKWERAD 
PITFKVEPER 
IHDVAPEOSG 
AAVAVZIAVL 
SLGSGPAAIN 
SLQTKDBQQQ 
KSKDEKLKSQ 
LSKDVYNSBY 
ADOEVLADLQ 



Seq ID NO: 668 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1389



1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGATTTAGA TGACAGAGAA 60
ACCCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCTGC TCTTTTTAAT 120
GTTCTCAACT CGATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT 180
GGGTTTCCTT TGGGAATATT GCTTTTATTC TGGGTTTCAT ATGTTACGGA CTTTTCCCTT 240
GTTTTATTGA TAAAAGGAGG GGCCCTCTCT GGAACAGATA CCTACCAGTC TTTGGTCAAT 300
AAAACTTTCG GCTTTCCAGG GTATCTGCTC CTCTCTGTTC TTCAGTTTTT GTATCCTTTT 360
ATAGCAATGA TAAGTTACAA TATAATAGCT GGAGATACTT TGAGCAAAGT TTTTCAAAGA 420
ATCCCAGGAG TTGATCCTGA AAACGTGTTT ATTGGTCGCC ACTTCATTAT TGGACTTTCC 480
ACAGTTACCT TTACTCTGCC TTTATCCTTG TACCGAAATA TAGCAAAGCT TGGAAAGGTC 540
TCCCTCATCT CTACAGGTTT AACAACTCTG ATTCTTGGAA TTGTAATGGC AAGGGCAATT 600
TCACTGGGTC CACACATACC AAAAACAGAA GACGCTTGGG TATTTGCAAA GCCCAATGCC 660
ATTCAAGCGG TCGGGGTTAT GTCTTTTGCA TTTATTTGCC ACCATAACTC CTTCTTAGTT 720
TACAGTTCTC TAGAAGAACC CACAGTAGCT AAGTGGTCCC GCCTTATCCA TATGTCCATC 780
GTGATTTCTG TATTTATCTG TATATTCTTT GCTACATGTG GATACTTGAC ATTTACTGGC 840
TTCACCCAAG GGGACTTATT TGAAAATTAC TGCAGAAATG ATGACCTGGT AACATTTGGA 900
AGATTTTGTT ATGGTGTCAC TGTCATTTTG ACATACCCTA TGGAATGCTT TGTCACAAGA 960
GAGGTAATTG CCAATGTGTT TTTTGGTGGG AATCTTTCAT CGGTTTTCCA CATTGTTGTA 1020
ACAGTGATGG TCATCACTGT AGCCACGCTT GTGTCATTGC TGATTGATTG CCTCGGGATA 1080
GTTCTAGAAC TCAATGGTGT GCTCTGTGCA ACTCCCCTCA TTTTTATCAT TCCATCAGCC 1140
TGTTATCTGA AACTGTCTGA AGAACCAAGG ACACACTCCG ATAAGATTAT GTCTTGTGTC 1200
ATGCTTCCCA TTGGTGCTGT GGTGATGGTT TTTGGATTCG TCATGGCTAT TACAAATACT 1260



CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320
AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380
TTTCAATGA



Seq ID NO: 669 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MGYQRQEPVI PPQRDLDDRE TLVSEHEYKE KTCQSAALFN VLNSIIGSGI IGLPYSMKQA 60

GFPLGILLLF WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGFPGYLL LSVLQFLYPF 120

IAMISYNIIA GDTLSKVFQR IPGVDPENVF IGRHFIIGLS TVTFTLPLSL YRNIAKLGKV 180

SLISTGLTTL ILGIVMARAI SLGPHIPKTE DAWVFAKPNA IQAVGVMSFA FICHHNSFLV 240

YSSLEEPTVA KWSRLIHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHIVV TVMVITVATL VSLLIDCLGI 360

VLELNGVLCA TPLIFIIPSA CYLKLSEEPR THSDKIMSCV MLPIGAVVMV FGFVMAITNT 420
QDCTHGQEMF YCFPDNFSLT NTSESHVQQT TQLSTLNISI FQ



Seq ID NO: 670 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1284

1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGGATTGCC TTATTCAATG 60

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AACGTGTTTA TTGGTCGCCA CTTCATTATT 360

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420

GGAAAGGTCT CCCTCATCTC TACAGGTTTA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480

AGGGCAATTT CACTGGGTCC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG TGTCATTGCT GATTGATTGC 960

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTATG 1080

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATGTTCT ACTGCTTTCC TGACAATTTC 1200

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260
ATTAGTATCT TTCAACTCGA GTAA



Seq ID NO: 671 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 60

LVNKTFGFPG YLLLSVLQFL YPFIAMISYN IIAGDTLSKV FQRIPGVDPE NVFIGRHFII 120

GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVMA RAISLGPHIP KTEDAWVFAK 180

PNAIQAVGVM SFAFICHHNS FLVYSSLEEP TVAKWSRLIH MSIVISVFIC IFFATCGYLT 240

FTGFTQGDLF ENYCRNDDLV TFGRFCYGVT VILTYPMECF VTREVIANVF FGGNLSSVFH 300

IVVTVMVITV ATLVSLLIDC LGIVLELNGV LCATPLIFII PSACYLKLSE EPRTHSDKIM 360

SCVMLPIGAV VMVFGFVMAI TNTQDCTHGQ EMFYCFPDNF SLTNTSESHV QQTTQLSTLN 420
ISIFQLE



Seq ID NO: 672 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1203

1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGT TTTCCCTTGT TTTATTGATA 60

AAAGGAGGGG CCCTCTCTGG AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGGC 120

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTGT ATCCTTTTAT AGCAATGATA 180

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT CCCAGGAGTT 240

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTG GAAAGGTCTC CCTCATCTCT 360

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420

CACATACCAA AAACAGAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGCGGTC 480

GGGGTTATGT CTTTTGCATT TATTTGCCAC CATAACTCCT TCTTAGTTTA CAGTTCTCTA 540

GAAGAACCCA CAGTAGCTAA GTGGTCCCGC CTTATCCATA TGTCCATCGT GATTTCTGTA 600

TTTATCTGTA TATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGGTAA CATTTGGAAG ATTTTGTTAT 720

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTCCACA TTGTTGTAAC AGTGATGGTC 840

ATCACTGTAG CCACGCTTGT GTCATTGCTG ATTGATTGCC TCGGGATAGT TCTAGAACTC 900

AATGGTGTGC TCTGTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080

CATGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200
TAA

Seq ID NO: 673 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MGYQRQEPVI PPQFSLVLLI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QFLYPFIAMI 60
SYNIIAGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGKVSLIS 120
TGLTTLILGI VMARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180
EEPTVAKWSR LIHMSIVISV FICIFFATCG YLTFTGFTQG DLFENYCRND DLVTFGRFCY 240
GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIVVTVMV ITVATLVSLL IDCLGIVLEL 300
NGVLCATPLI FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAVVMVFGFV MAITNTQDCT 360
HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISIFQLE



Seq ID NO: 674 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1140 



1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCGGCTTT 60
CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120
TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180
CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240
CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300
GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360
ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGGG 420
GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480
GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540
ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600
TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660
GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720
GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780
ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840
GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900
TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960
GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020
GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCGAGTAA 1140

Seq ID NO: 675 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MGYQRQEPVI PPQVNKTFGF PGYLLLSVLQ FLYPFIAMIS YNIIAGDTLS KVFQRIPGVD 60
PENVFIGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSLIST GLTTLILGIV MARAISLGPH 120
IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSRL IHMSIVISVF 180
ICIFFATCGY LTFTGFTQGD LFENYCRNDD LVTFGRFCYG VTVILTYPME CFVTREVIAN 240
VFFGGNLSSV FHIVVTVMVI TVATLVSLLI DCLGIVLELN GVLCATPLIF IIPSACYLKL 300
SEEPRTHSDK IMSCVMLPIG AVVMVFGFVM AITNTQDCTH GQEMFYCFPD NFSLTNTSES 360
HVQQTTQLST LNISIFQLE



Seq ID NO: 676 DNA sequence 

Nucleic Acid Accession #: NM_006853.1

Coding sequence; 26. -fi74 



AGGAATCTGC 
ATCX3GGCAGA 
CATGAGGATT 
CAGGATCATC 
GGAGAAGACG 
AGCCCACTGC 
GGAGGGCTGT 
CAGCCTCCCC 
CTOCATCACC 
CAGCTGCCTC 
CTTGCGATGC 
CAACATCACA 
GGGTGACTCC 
CCAGGATCCG 
GGACTGGATC 
ACCCTCCATT 
CAAGACCCTC 
AATCAACCTG 
GACTCTGGGA 
TCCTGGCCAT 



11 
I 

GCTCGGGTTC 
GGTCTCACAG 
CTGCAGTTAA 
AAGGGGTTCG 
CGGCTACTCT 
CTCAAGCCCC 
GAGCAGACCC 
AACAAAGACC 
TOGGCTGTGC 
ATTTCCGGCT 
GCCAACATCA 
GACACCATGG 
GGGGGCCCTC 
TGTGCGATCA 
CAGGAGACGA 
TCCACTTGGT 
TACGAACATT 
GGGTTCGAAA 
ATGACAACAC 
ATATCAAGGT 



21 
I 

CGCAGATGCA 
CAGCCAAGGA 
TCCTGCTTGC 
AGTGCAAGCC 
GTGGGGCGAC 
GCTACATAGT 
GGACAGCCAC 
ACCGCAATGA 
GACCCCTCAC 
GGG6CAGCAC 
CCATCATTGA 
TGTGTGCCAG 
TGGTCTGTAA 
CCCGAAAGCC 
TGAAGAACAA 
GTTTGGTTOC 
CTTTGGGCCT 
TCAGTGAGAC 

TTCAATAAAT 



31 

I 

GAGGTTGAGG 
ACCTGGGGCC 
TCTGGCAACA 
TCACTCCCAG 
GCTCATCGCC 
TCACCTGGGG 
TGAGTCCTTC 
CATCATGCTG 
CCTCTCCTCA 
GTCCA6CCCC 
GCACCA6AAG 
CGTGCAGGAA 
CCAGTCTCTT 
TGGTGTCTAC 
TTAGACTGGA 
TGTTCACTCT 
CCTGGACTAC 
CTGGATTCAA 
CTCTGTTGTA 
ATTTGCTAAA 



41 

I 

TGGCTGCGGG 
OGCTCCTCCC 
OOGCTTGTAG 
CCCTGGCAGG 
CCCAGATGGC 
CAGCACAACC 
CCCCACCCCG 
GTGAAGATGG 
CGCTGTGTCA 
CAGTTAOGCC 
TGTGAGAACG 
GGGGGCAAGG 
CAAGGCATTA 
ACGAAAGTCT 
CCCACCCACC 
GTTAATAAGA 
AGGAGATGCT 
ATTCTGCCTT 
TCCCCAGCCC 
TGAGTG 



Seq ID NO: 677 Protein sequence
Protein Accession #: NP_006844.1




51 
I 

ACTGGAAGTC 
CCCTCCAGGC 
GGGGAGAGAC 
CAGCCCTGTT 
TCCTGACAGC 
TCCAGAAGGA 
GCTTCAACAA 
CATCGCCAGT 
CTGCTGGCAC 
TGCCTCACAC 
CCTACCCGGG 
ACTCCTGCCA 
TCTCCTGGGG 
GCAAATATGT 
ACAGCCCATC 
AACCCTAAGC 
GTCACTTAAT 
GAAA7ATTGT 
CAAAGACAGC 






1 11 21 31 41 51

MRILQWUiA LATGLVGGET RIIKGFECKP HSQPWQAALP EKTBLLOGAT LZAPRWLLTA 60

AHCLKPRYIV HIiGQHNLQKE EGCEXOTTAT ESFPHPGEKH SLPKKMWJD ^W-^^ASPV 120 
SI^^PLT l^CmSH SCLISGWGST SSPQLRLPHT lACAKITIlE HQKCEMftWG 180 
StOTIW^ vSgGKDSCO CTSGGPLVO* QSLQGIISKG QDPCMTRKP GWKVOaV 240 
DHIQETMroiS 

Seq ID NO: 678 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..933

1 11 21 31 41 51

| | | | | |

ItGTGCAGCA i-reGAOGGTG kTCOTGGC GCCTGGCMTT GTGACGGGCT GOTGACT^ 60 

TTCGACAAGA GTGATGAGAA GGAGTGCCCC AAGGCTAAGT CGAAATCPGO CCC GAOCT TC 120 

SS^SSS SaGCGCXIAT CCATTGCATC ATTGCm^ "0 

GACTGTCCCG ATGGCAGCGA TCAAGAGAAC TGCACAGCAA ACCCTCTGCT TTGCTCCACC 240 

SctSa^ CGGCCTCTCT ATTGACAAGA GCTTCATCre CGATGGACAG 300 

^SSrc SSaCAACAC TCATGAGGAA AGCItSTCAAA GTTCTCAAGA ACCCGGCAGT 360 
SSSS ^J^CTTC AGAGAACCAA CTT(^^ 

ATCATraGCA GCTCCGTCAT TTTTGTGCTG GTGGTGGCCC TGCXGGCACT GGTCTTGCAC 480 

S^SS^ S^SS CCTCATGACX. CTGCC^ 540 

CTGCTGTCCC GCCTGGTGGT CCTGGACCAC CCCCACCACT GCAACGTCAC CTACAAOGTC 600 

AATAATGGCA TCCAGTATGT GGCCAGCCAG GOGGAGCAGA ATGCGTCGGA AGTAGGCTCC 660 

JSS^St ACTOTAGGC CTTCCTGGAC CAGAGG0CIX3 CGTGGTATGA CCTTCCTCCA 720 

Si^SSc GGAATCTCTG AAOCAAGOT ^ 

CGGTCOSGGA GTGCCAACAG TGCCAGCTCC CAGGCAGCCA GCAGCCTCCT <3AG0GTGGAA 840 

GACaSa^ ACAGCCCGGG GCAGCCTGGC CXCCAGGAGG GCACTGCTOA GCCCAGGGAC 900 
TCTGAGCCGA GCCAGGGCAC TGAAGAAGTA TAA 

Seq ID NO: 679 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

ICSKGRCIPG LQCDGLPDC FDKSDBKECP KAKSKCGPTF FPCASGIHCI IGRPRCNGFE 60

DCTDGSDEEN CTANPLLCST ARYHCKNGLC IDKSPICDGQ NNCQDNSDEE SCESSQEPGS 120
GQVPVTSENQ LVYYPSITYA IIGSSVIFVL WALIALVUI HQRKRNNLMT LPVHW^HPV 180
LLSRLWUW PHKQIVTYMV NNGIQYVASQ ABQNASEVGS PPSYSEALLD QRPAWYDLPP 240
PPYSSDTBSL NQADLPPYRS RSGSANSASS QAASSLLSVE DTSHSPGQPG PQE6TAEPR0 300
SEPSQGTEEV

Seq ID NO: 680 DNA sequence
Nucleic Acid Accession #: S78203.1
Coding sequence: 1..2190

1 11 21 31 41 51 

ATGAATCCTT TCCAGAAAAA TGAGTCCAAG GAAACTCTTT TTTCACCTGT CTCCATXGAA 60 

GAGGTACCAC CTCGACCACC TAGCCCTCCA AAGAAGCCAT CTCCGACAAT CTGTGG CTCC 120 

AACTATCCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCGAGCG CTTTTCCTAT 180 

TATGGAATGA AAGCTGTGCT GATCCTGTAT TTCCTGTATT TCCTGCACTG GAATGAAGAT 240 

ACCrcCACAT CTATATACCA TGCCTTCAGC AGCCTCTGTT ATTTTACTCC CATCCTGGGA 300 

GCAGCCATTG CTGACTCGTG GTTGGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTO 360 

TATGTGCTTG GCCATGTGAT CAAGTCCTTG GGTGCCTTAC CAATACT6GG AGGACAAGTG 420 

GTACACACAG TCCTATCATT GATCGGCCTG AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480 

AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATGC AGAGGAACGG 540 

ACTAGATACT TCTCAGTCTT CTACCTGTCC ATCAATGCAG GGAGCTTCAT TTCTACATTT 600 

ATCACACCCA TGCTGAGAGG AGATGTGCAA TGTTTTGGAG AAGACTGCTA TGCATTGGCT 660 

TTTGGAGTTC CAGGACTGCT CATGGTAATT GCACTTGTTG TGTTTGCAAT GGGAAGCAAA 720 

ATATACAATA AACCACCCCC TGAAGGAAAC ATAGTGGCTC AAGrTTTCAA ATGTATCTGG 780 

TTTGCTATTT CCAATCGTTT CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840 

CTAGACTGGQ CAGCTGAGAA ATATCCAAAG CAGCTCATTA TGGATGTAAA GGCACTGACC 900 
AGGGTACTAT TCCTTTATAT OCCATTGCCC AT6TTCTGGG CTCTTTTGGA TCAGCAGGGT 960
TCACGATGGA CTTTCCAAGC CATCAGGATG AATAGGAATT TGGGGTTTTT TGTGCTTCAG 1020
CCGGACCAGA TGCAG(rrTCT AAATCCCTTT CTG6TTCTTA TCTTCATCCC GTT^ 1080

TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAOWAAATC 1140 

GCTGTTGGTA TGATCCTAGC GTGCCTGGCA TTTGCAGTTG OGGCAGCTGT AGAGATAAAA 1200 

ATAAATCAAA TGGCCCCAGC CCAGTCAGGT CCCCAGGAGG TTTTCCTACA AGTCTTGAAT 1260 

CTGGCAGATG ATGAGGTGAA GGTGACAGTG GTGGGAAATG AAAACAATTC TCTGTTGATA 1320 

GAGTCCATCA AATCCTTTCA GAAAACACCA CACTATTCCA AACTGCACCT GAAAACAAAA 1380 

AGCCAGGATT TTCACTTCCA OCTGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440 

GTGCAGGAGA AGAACTGGTA CAGTCTTGTC ATTCGTGAAG ATGGGAACAG TATCTCCAGC ISOO 

ATGATGGTAA AGGATACAGA AAGCAAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560 

AACACTTTGC ATAAAGATGT CAACATCTCC CTGAGTACAG ATACCTCTCT CAATOTTG6T 1620 

GAAGACTATG GTGTGTCTGC TTATAGAACT GTGCAAAGAG GAGAATACCC TGCAGTGCAC 1680 

tGTAGAACAG AAGATAAGAA CTTTTCTCTG AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740 

TATCTGTTTC TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800 

ATTCCAGCCA ACAAAATGTC CATTGCGTGG CAGCTACCAC AATATGCCCT GGTTACAGCT 1860 

GGGGAGGTCA TGTTCTCTGT CACAGGTCTT GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920 

ATGAAATCTG TGCTCCAGGC AGCTTGGCTA TTGACAATTG CAGTTGGGAA TATCATCGTG 1980 

CTTGTTGTCG CACAGTTCAG TGGCCTCGTA CAGTGGGCCG AATTCATTTP GTTTTCCTGC 2040 

CTCCTGCTGG TGATCTGCCT GATCTTCTCC ATCATGGGCT ACTACTATGT TCCTGTAAAG 2100 

ACAGAG6ATA TGCGGGGTCC AGCAGATAAG CACATTCCTC ACATCCAGGG GAACATGATC 2160 
AAACTAGAGA CCAAGAAGAC AAAACTCTGA 

Seq ID NO: 681 Protein sequence
Protein Accession #: AAB34388.1



1 11 21 31 41 51

| | | | | |

KSIPFQIC3SSR BtLFSFVSIB EVPPRPPSPP KKPSPTICG5 SYPLSIAPIV VHEPCBRPSY 60
YGMKAVLXLY PLYFLHHNSJ TSTSIYHAPS SLCYPTPILG AAIADSKUSK FKTIIYIiSLV 120
YVLGHVIKSL GALPILGGOV VHTVLSLIGL SLIALGTGGI KPCVAAFGGD QFBBKHABER 180
TRYTSVFTLS INAGSLISTF ITPWL2GDVQ CFGEDCYALA FGVPGLUflVI A1>\A^AMGSK 240
jTtNKPPPEGN IVAQVPKCIH FAISNRFKNR SGDIPKRQHW XiDWAAEKYPK QLIMDVKALT 300
RVLPLYIPLP MFWALLDQQG SRHTLQAIRM N2NIiGFFVIiQ PDQMQVI^PP LVLIFIPLFD 360
FVIYRLVSKC GINFSSLRKH AVOlILACLA FAVAAAVEIK INEMAPAQSG PQEVFLQVLN 420
LADDEVKVTV V<SimNShLI ESIKSPQKTP HYSKLHLKTK SQDFHPHLKY ffiJLSLYTEHS 480
VQBKNWYSLV IBEDGNSISS KMVKDTESKT ntGMTTVRFV ZTTLHXDVKIS LSTDTSLMVG 540
EDYGVSAYRT VQRGEYPAVH CRTEDKKPSI. MLGIxLDFGAA YLFVITNNTN QGLQAWKIED 600
IPANXMSZAM QLPQYALVTA GCVMFSVTGL EFSYSQAPSS MKSVLQAAHL LTZAVGNIIV 660
LWAQFSGLV QWAEFILFSC LLLVICLIFS IMGYYYVPVK TEDMRGPADK HIPHIQGMMI 720
KLSTKKTKL



Seq ID NO: 682 DNA sequence

Nucleic Acid Accession #: NM_016077.1

Coding sequence: 128..667

1 11 21 31 41 51

| | | | | |

TCGCTTTGTG ATTCTTCATC CGGAACTTTC TCACCCAGGA ACCCOGGAAG AGGTAGCTCA 
CGCGATAGAA AGErTGTTCGC TTGCCCAGAA GAAGGGAAGG CGOGAGTGAG GAAAGGRGGT 
ACTGTAGATG CCCTCCAAAT CCTTGGTTAT GGAATATTTG GCTCATCCCA GTAC ACTOGG 
CTTGGCTGTT GGAGTTGCTT GTGGCATGTG CCTGGGCTGG AGCCTTCGAG TATGCTTTGG 
GATGCTOCXX: AAAAGCAAGA OGAGCAAGAC ACACACAGAT ACTGAAAGTG AAGCAAGCAT 
CTTGGGAGAC AGCGGGGAGT ACAAGATGAT TCTTGTGGTT CGAAATGACT TAAAGATGGG 
AAAAGGGAAA C3X3GCTCCCC AGTGCTCTCA TGCTGCTGTT TCAGCCTACA A GCAGATTCA 
AAGAAGAAAT OCTGAAATGC TCAAACAATG GGAATACTGT GGCCAGOOCA AGGTGGTGGT 
CRAAGCTCCT GATGAAGAAA CCXTTGATTGC ATTATTGGCC CATGCAAAAA TGCTGGGACT 
GACTGTAAGT TTAATTCAAG ATGCTGGACG TACTCAGATT GCACCAGGCT CTCAAACTGT 
CCTAGGGATT GGGCCACGAC CAGCAGACCT AATTGACAAA GTCACTGGTC ACCTAAAACT 
TTACTAGGTG GACTTTOATA TGACAACAAC CCCTCCATCA CAAGTGTTTG AAGCCTQTCA 
GATTCTAACA ACAAAAGCTG AATTTCTTCA COCAACTTAA ATGTTCTTGA GATGAAAATA 
AAACCTATTC CCATGTTCTA AAAAAA 



Seq ID NO: 683 Protein sequence
Protein Accession #: NP_057161.1

1 11 21 31 41 51

| | | | | |

KPSKSLVKEY LAHPSTLGIA VGVAG6MCLG WSLHVCPGML PRSKTSKTHT DTBSEASILG
OSGEYKMILV VRKDLKMGKG KVAAQCSHAA VSAYKQIQRR NPEKLKQWEY CGQPKVWKA
PDBETLIAI.L AKAKMLGLTV SLIQDAGRTQ lAPGSQTVLG IGPGPADLID KVTGHLKLY



Seq ID NO: 684 DNA sequence

Nucleic Acid Accession #: NM_004864.1

Coding sequence: 26..952

1 11 21 31 41 51

| | | | | |

CGGAACGAGG GCAACCTGCA CAGCCATGCC CGGGCAAGAA CTCAGGACGG TGAATGGCTC 
TCAGATGCTC CTGGTGTTGC TGGTGCTCTC GTGGCTGCCG CATGGGGGCG CCCTGTCTCT 
GGC0GAG6CG AGCOGCGCAA GTTTCCCGGG ACCCTCRGAG TXGCACTCCG AAGACTCCAG 
ATTCCGAGAG TTCCGGAAAC GCTACGAGGA CCTGCTAAOC AGGCTGOGGG CCAACCAGAG 
CTGGGAAGAT TCGAACACCG ACCTCGTCCC GGCCCCTGCA GTCOGGATAC TCACGCCAGA 
AGTGCX3GCTG GGATCCGGC6 GCCACCTGCA CCTGOGTATC TCTCX3GGC0G CCCTTCCCGA 
GGGGCTCCCC GAGGCCTCCC GCCTTCACCG GGCTCTGTTC CGGCTGTCCC CGACGGCGTC 
AAOGTCGTGG GACGTGACAC GACCGCTGOG GCGTCAGCTC AGCCTTGCAA GACCCCAAGC 
GCCCGCGCTG CACCTGCGAC TGTCGCCGCC GCCGTCGCAG TCGGACCAAC TGCTGGCAGA 
ATCTTCGTCC GCACGGCCCC AGCTGGAGTT GCACTTGCGG OOGCAAGCCG CCAGG GGGCG 
CGGCAGAGCG CGTGCGCGCA ACGGGGAOGA CTGTCCGCTC GGGOCCGGGC GTTGCTGCCG 
TCTGCACACG GTCCGCGCGT CGCTGGAAGA CCTGGGCrGG GCGQATrGGG TGCTGTCGCC 
AOGGGAGGTG CAAGTGACCA TGTGCATCGG OSCGTGCCCXS AGCCAGTTCC GGGCGGCAAA 
CATCCACGCG CAGATCAAGA OGAGCCTGCA CCGCCTGAAG CCCGACACGG AGCCAGCGCC 
CTGCTGOGTC CCCGCCACCT ACAATCCCAT GGTGCTCATT CAAAAGACCG ACACOGGGGT 
OTOGCTCCAG ACCTATGATG ACTTGTTAGC CAAAGACT6C CACTGCATAT GAGCAGTCCT 
GGTCCTTCCA CTGTGCACCT GCGCGGGGGA GGOGACCTCA GTTGTCCT6C CCTGTGGAAT 
GGGCTCAAGG TTCCTGAGAC ACCCGATTCC TGCCCAAACA GCTGTATTTA TATAAGTCTG 
TTATTTATTA TTAATTTATT GGGGTGACCT TCTTGGGGAC TCGGGGGCTG GTCTGATGGA 
ACTGTGTATT TATTTAAAAC TCTGGTGATA AAAATAAAGC TGTCTGAACT GTTAAAAAAA 
AAAA 



Seq ID NO: 685 Protein sequence 
Protein Accession #: NP_004655.1

1 11 21 31 41 51

| | | | | |

MPGQELRTVN GSQMLLVLLV LSWLPHGGAI* SLAEASRASF PGPSELHSED SRFRKIiRKRY 
EDLLTRLRAN QSWEDSNTDL VPAPAVRILT PEVRLGSGGH UJLRISHAAI, PEGLPEASRL 
HRALPRLSPT ASRSWDVTRP LRRQLSLARP QAPALHLRLS PPPSQSDQLL ABSSSARPQL 
ELHLRPQAAR GRRRARARNG DDCPLGPGRC CRLHTVRASL EDLGWADWVL SPREVQVTMC 
IGACPSQFRA ANMHAQIKTS LHRLKPDTEP APCCVPASW PMVLIQXTDT GVSLQTYDDL 
LAKDCHCI 



Seq ID NO: 686 DNA sequence 
Nucleic Acid Accession #: NM_002423.2
Coding sequence: 48..851

1 11 21 31 41 51

| | | | | |

ACCAAATCAA CCATAGGTCC AAGAACAATT GTCTCTGGAC GGCAGCTATG OQACTCACCG 60 

TGCTGTGTGC TGTGTGCCTG CTGCCTQGCA GCCTGGCCCT GCX3GCTGCCT CAGGAGGCX3G 120

GAGGCATGAG TGAGCTACAG TGGGAACAGG CTCAGGACTA TCTCAAGAGA TTTTATCTCT 180 

ATGACTCAfiA AACAAAAAAT GCCAACAGTT TAiGAAGCCAA ACTCAAGGAG ATGCAAAAAT 240

TCTTTGGCCT ACCTATAACT GGAATGTTAA ACTOCOGCGT OVTAGAAATA ATGCAGAAOC 300

CCAGATGTG6 AOTGCCAGAT GTTGCAGAAT ACTCACtATT TGCAAATAGC OCAAAAtXSGA 360 

CTTCCAAAGT GGTCACCTAC AGGATOGTAT CATATACTOG AGACTTACOS CATATTACAG 420 

TGGATCGATT AGTGTCAAAG CCTTTAAACA TGTGGGGCAA AGAGATCOCC CTGCATTTCA 480 

GGAAAGTTGT ATGGGGAACT GCTGACATCA TGATTGGCTT TGCGOGAGGA GCTC ATGGGG 540 

ACTCCTACCC ATTTGATGGG CCAGGAAACA OGCTGGCTCA TGCCTTTGOG CCTGGGACAG 600

GTCTOGGAGG AGATGCTCAC TTCGATGAGG ATGAAOGCTG GACGGATGGT AGCAGTCTAG 660 

GGATTAACTT CCTGTATGCT GCAACTCATG AACTTGGCCA TTCTTTGGGT ATGGGACATT 720 

CCTCTGATCC TAAT5CAGTG ATGTATCCAA CCTATGGAAA TGGAGATCOC CAAAATTTTA 780 

AACTTTCCCA GGATGATATT AAAGGCATTC AGAAACTATA TGGAAAGAGA AGTAATTCAA 840 

GAAAGAAATA GAAACTTCAG GCAGAACATC CATTCATTCA TTCATTGGAT TGTATATCAT 900

TGTTCCACAA TCAGAATTGA TAAGCACTGT TCCTCCACTC CATTTAGCAA TTATGT CACC 960 

CTTTTTTATT GCAGTTQGTT TTTGAATGTC TTTCACTCCT TTTATTGGTT AAACTCCTTT 1020 

ATGGTGTGAC TGTGTCTTAT TCCATCTATC ACCTTTGTCA GTG060QTAG ATGTGAATAA 1080 
ATCTTACATA CACAAATAAA TAAAATGTTT ATTOCATGGT AAATTTA 



Seq ID NO: 687 Protein sequence 
Protein Accession #: NP_002414.1



1 11 21 31 41 51

| | | | | |

MRLTVLCAVC LLPGSLAIiPIj PQEAGGMSEL QWEQAQDYLK RPYLYDSETK NANSLEAKLK 60 
EMQKFFGLPI TGMLNSRVIE IMQKPROSVP DVAEYSLFPU SPKWTSKWT YRIVSYTRDL 120 
PHITVDSLVS KALNMWGKEI PLHFRKWWG TADIMIGFAR GAKGDSYPFD GPGNTLAHAP 180 
APGTGLGGDA KPOEDERWTD GSSIiGINFLY AATREIiGaSL GM6HSSDPNA VWPTYGNGD 240 
PQNFKLSQDD IKGIQKLYGK RSNSRKK



Seq ID NO: 688 DNA sequence

Nucleic Acid Accession #: NM_005221.3

Coding sequence: 1..870



1 11 21 31 41 51 

| | | | | |

ATGACAGGAG TGTTTGACAG AAGGGTCCCC AGCATCCGAT CCGGCGACTT CCAAGCTCCG 60 

TTCCAGACGT CCGCAGCTAT GCACCATCCG TCTCAGGAAT CGCCAACTTT GCCG6AGTCT 120 

TCAGCTACCG ATTCTGACTA CTACAGCCCT ACXK3GGGGAG CCCCGCAOGG CTACTGCTCT 180

CCTACCTCGG CTTCCTATGG CAAAGCTCTC AACCCCTACC AGTATCAGTA TCACGGCGTG 240 

AACGGCTCCG COGGGAGCTA CCCAGCCAAA GCTTATGCCG ACTATAGCTA CGCTAGCTCC 300 

TACCACCAGT ACGGCGGCGC CTACAACCGC GTCCCAAGGG CCACCAACCA GCCAGAGAAA 360 

GAAGTGACC6 AGOCOSAGGT GAGAATGGT6 AATG6CAAAC CAAAGAAAGT TOGTAAACOC 420 

AGGACTATTT ATTCCAGCTT TCAGCTGGOC GCATTACAGA GAAGGTTTCA GAAGACTCAG 480

TACCTCGCCT TGCCGGAACX5 CGCCGAGCTG GOXSCCTCGC TGGGATTGAC ACAAACACAG 540 

GTGAAAATCT GGTTTCAGAA CAAAAGATCC AAGATCAAGA AGATCATGAA AAACGGGGAG 600 

ATGCCCCCGG AGCACAGTCC CAGCTCCAGC GACCCAATGG CGTGTAACTC GCCGCAGTCT 660 

CCAGCOGTGT GGGAGGCCCA GGGCTCGTCC CGCTCGCTCA 6CCACCACCC TCATGCCCAC 720 

CCTC06ACCT CCAACCAGTC CGCAGCGTCC AGCTACCTGG AGAACTCTGC ATCCTGGTAC 780

ACAAGTGCAG CCAGCTCAAT CAATTCCCAC CTGOOGCOGC OGGGCTCCTT ACAGCACC06 840. 
CTQGCGCTGG CCTCOGGGAC ACTCTATTAG 



Seq ID NO: 689 Protein sequence
Protein Accession #: NP_005212.1



1 11 21 31 41 51

| | | | | |

MTGVFDRRVP SIRSGDFQAP PQTSAAMHHP SQESPTLPES SATDSDYYSP TX3GAPHGYCS 60 
PTSASYGKAL NPYQYQYHGV NGSAGSYPAK AYADYSYASS YHQYGGAYNR VPSATNQPEK 120
EVTEPEVRMV NGKPKKVRKP RTIYSSPQI*A ALQRRFQKTQ YLALPERAEIi AASLGLTQTQ 180 
VKIWFQKKRS KlfOaHKNGE MPPEHSPSSS DPMACNSPQS FAVWEPQGSS RSLSMUPUAH 240 
PPTSNQSPAS SYLEHSASWY TSAASSINSH LPPPGSLQHP LALASGTLY 
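
The coding-sequence coordinates given with the DNA listings above can be checked against the lengths of the corresponding protein listings. The following is a minimal Python sketch and is not part of the original disclosure. It assumes that a "start..end" range is 1-based and inclusive and that the last codon in the range is a stop codon; the function name cds_protein_length is hypothetical.

```python
import re


def cds_protein_length(coding_range: str) -> int:
    """Return the number of residues encoded by a 'start..end' coding range.

    Assumptions (not stated in the listing itself): coordinates are
    1-based and inclusive, and the final codon of the range is a stop codon.
    """
    match = re.fullmatch(r"\s*(\d+)\s*\.\.\s*(\d+)\s*", coding_range)
    if match is None:
        raise ValueError(f"not a 'start..end' range: {coding_range!r}")
    start, end = int(match.group(1)), int(match.group(2))
    length = end - start + 1  # inclusive span in nucleotides
    if length < 3 or length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3 - 1  # drop the terminal stop codon


if __name__ == "__main__":
    print(cds_protein_length("48..851"))  # range listed for Seq ID NO: 686
    print(cds_protein_length("1..870"))   # range listed for Seq ID NO: 688
```

Under those assumptions, the range 48..851 corresponds to 267 residues and the range 1..870 to 289 residues, which agrees with the residue counts implied by the position markers in Seq ID NO: 687 and Seq ID NO: 689.
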
It is understood that the examples described above in no way serve to limit the true
scope of this invention, but rather are presented for illustrative purposes. All publications,
sequences of accession numbers, and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference.
WHAT IS CLAIMED IS:

1. A method of detecting a lung cancer-associated transcript in a cell
from a patient, the method comprising contacting a biological sample from the patient with a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1A-16.

2. The method of claim 1, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1A-16.

3. The method of claim 1, wherein the biological sample is a tissue
sample.

4. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

5. The method of claim 4, wherein the nucleic acids are mRNA.

6. The method of claim 4, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

7. The method of claim 1, wherein the polynucleotide comprises a
sequence as shown in Tables 1A-16.

8. The method of claim 1, wherein the polynucleotide is labeled.

9. The method of claim 8, wherein the label is a fluorescent label.

10. The method of claim 1, wherein the polynucleotide is immobilized on
a solid surface.

11. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat lung cancer.

12. The method of claim 1, wherein the patient is suspected of having lung
cancer.

13. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated transcript in the
biological sample by contacting the biological sample with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16,
thereby monitoring the efficacy of the therapy.

14. The method of claim 13, further comprising the step of: (iii) comparing
the level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

15. The method of claim 13, wherein the patient is a human.

16. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated antibody in the biological
sample by contacting the biological sample with a polypeptide encoded by a polynucleotide
that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in
Tables 1A-16, wherein the polypeptide specifically binds to the lung cancer-associated
antibody, thereby monitoring the efficacy of the therapy.

17. The method of claim 16, further comprising the step of: (iii) comparing
the level of the lung cancer-associated antibody to a level of the lung cancer-associated
antibody in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

18. The method of claim 16, wherein the patient is a human.

19. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated polypeptide in the
biological sample by contacting the biological sample with an antibody, wherein the antibody
specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to
a sequence at least 80% identical to a sequence as shown in Tables 1A-16, thereby
monitoring the efficacy of the therapy.

20. The method of claim 19, further comprising the step of: (iii) comparing
the level of the lung cancer-associated polypeptide to a level of the lung cancer-associated
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

21. The method of claim 19, wherein the patient is a human. 

22. An isolated nucleic acid molecule consisting of a polynucleotide
sequence as shown in Tables 1A-16.

23. The nucleic acid molecule of claim 22, which is labeled.

24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

25. An expression vector comprising the nucleic acid of claim 22.

26. A host cell comprising the expression vector of claim 25.

27. An isolated polypeptide which is encoded by a nucleic acid molecule
having polynucleotide sequence as shown in Tables 1A-16.

28. An antibody that specifically binds a polypeptide of claim 27.

29. The antibody of claim 28, further conjugated to an effector component.

30. The antibody of claim 29, wherein the effector component is a
fluorescent label.

31. The antibody of claim 29, wherein the effector component is a
radioisotope or a cytotoxic chemical.

32. The antibody of claim 29, which is an antibody fragment.

33. The antibody of claim 29, which is a humanized antibody.

34. A method of detecting a lung cancer cell in a biological sample from a
patient, the method comprising contacting the biological sample with an antibody of claim
28.

35. The method of claim 34, wherein the antibody is further conjugated to
an effector component.

36. The method of claim 35, wherein the effector component is a
fluorescent label.

37. A method of detecting antibodies specific to lung cancer in a patient,
the method comprising contacting a biological sample from the patient with a polypeptide
encoded by a nucleic acid comprises a sequence from Tables 1A-16.

38. A method for identifying a compound that modulates a lung cancer-
associated polypeptide, the method comprising the steps of:
(i) contacting the compound with a lung cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1A-16; and
(ii) determining the functional effect of the compound upon the polypeptide.

39. The method of claim 38, wherein the functional effect is a physical
effect.

40. The method of claim 38, wherein the functional effect is a chemical
effect.

41. The method of claim 38, wherein the polypeptide is expressed in a
eukaryotic host cell or cell membrane.

42. The method of claim 38, wherein the functional effect is determined by
measuring ligand binding to the polypeptide.

43. The method of claim 38, wherein the polypeptide is recombinant.

44. A method of inhibiting proliferation of a lung cancer-associated cell to
treat lung cancer in a patient, the method comprising the step of administering to the subject a
therapeutically effective amount of a compound identified using the method of claim 38.

45. The method of claim 44, wherein the compound is an antibody.

46. The method of claim 45, wherein the patient is a human.

47. A drug screening assay comprising the steps of:
(i) administering a test compound to a mammal having lung cancer or a cell
isolated therefrom;
(ii) comparing the level of gene expression of a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16 in a
treated cell or mammal with the level of gene expression of the polynucleotide in a control
cell or mammal, wherein a test compound that modulates the level of expression of the
polynucleotide is a candidate for the treatment of lung cancer.

48. The assay of claim 47, wherein the control is a mammal with lung
cancer or a cell therefrom that has not been treated with the test compound.

49. The assay of claim 47, wherein the control is a normal cell or mammal.

50. A method for treating a mammal having lung cancer comprising
administering a compound identified by the assay of claim 47.

51. A pharmaceutical composition for treating a mammal having lung
cancer, the composition comprising a compound identified by the assay of claim 47 and a
physiologically acceptable excipient.